Label Semantics:
Label 0: Non-crystallizable (Negative)
Label 1: Crystallizable (Positive)
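As a minimal sketch of how these label ids might be applied, the snippet below maps a two-element logit vector to a label name and probability. It assumes the classifier emits one logit per label in id order (0, then 1); the helper names `softmax` and `decode_prediction` and the example logits are hypothetical, not part of the released model.

```python
import math

# Label mapping as stated in this model card.
ID2LABEL = {0: "Non-crystallizable", 1: "Crystallizable"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(logits):
    """Return (label_name, probability) for a two-element logit vector."""
    probs = softmax(logits)
    label_id = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[label_id], probs[label_id]


# Hypothetical logits for illustration only (not real model output).
label, prob = decode_prediction([-1.2, 2.3])
print(label, round(prob, 3))
```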
Dataset
Model
ESMCrystal_t12_35M_v2
ESMCrystal_t12_35M_v2 is a state-of-the-art protein crystallization prediction model fine-tuned from esm2_t12_35M_UR50D using transfer learning. It has 12 layers and 35M parameters (approx. 136 MB) and predicts whether an input protein sequence will crystallize.
Accuracy:
Dataset | Accuracy |
---|---|
DeepCrystal test | 0.8161222339304531 |
BCrystal test | 0.8052602126468943 |
SP test | 0.7637130801687764 |
TR test | 0.8389328063241107 |
Comparison table:
Dataset | Count | Positives | Negatives | TP | FP | FN | TN | Precision | Recall | F1 | Accuracy | ROC AUC | Matthews correlation coefficient | Sensitivity (TPR) | Specificity (TNR) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
DeepCrystal test | 1898 | 898 | 1000 | 579 | 30 | 319 | 970 | 0.95073892 | 0.64476615 | 0.76841407 | 0.81612223 | 0.9403 | 0.657526117 | 0.64476615 | 0.97 |
BCrystal test | 1787 | 891 | 896 | 573 | 30 | 318 | 866 | 0.95024876 | 0.64309764 | 0.76706827 | 0.80526021 | 0.9396 | 0.644635696 | 0.64309764 | 0.96651786 |
SP test | 237 | 148 | 89 | 97 | 5 | 51 | 84 | 0.95098039 | 0.65540541 | 0.776 | 0.76371308 | 0.9293 | 0.586069704 | 0.65540541 | 0.94382022 |
TR test | 1012 | 374 | 638 | 225 | 14 | 149 | 624 | 0.94142259 | 0.60160428 | 0.73409462 | 0.83893281 | 0.9562 | 0.658766192 | 0.60160428 | 0.97805643 |
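The precision, recall, F1 and accuracy figures can be recomputed from the four confusion counts. The sketch below takes the counts as implied by the per-class reports and supports in this card (for example, on the DeepCrystal test 579 of 898 crystallizable sequences are predicted correctly, and 970 of 1000 non-crystallizable ones); the function and variable names are illustrative.

```python
def basic_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy for the positive (crystallizable) class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Counts as (tp, fp, fn, tn); tp + fn equals the number of positives,
# and fp + tn equals the number of negatives in each test set.
COUNTS = {
    "DeepCrystal test": (579, 30, 319, 970),
    "BCrystal test": (573, 30, 318, 866),
    "SP test": (97, 5, 51, 84),
    "TR test": (225, 14, 149, 624),
}

results = {name: basic_metrics(*counts) for name, counts in COUNTS.items()}
for name, metrics in results.items():
    print(name, {key: round(value, 4) for key, value in metrics.items()})
```

With these counts, precision for the crystallizable class comes out near 0.95 and recall near 0.64 on every test set, matching the per-class reports.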
Graphs
ROC-AUC Curve
PR-AUC Curve
Final scores :
- on DeepCrystal test:
| | precision | recall | f1-score | support |
---|---|---|---|---|
non-crystallizable | 0.75 | 0.97 | 0.85 | 1000 |
crystallizable | 0.95 | 0.64 | 0.77 | 898 |
accuracy | | | 0.82 | 1898 |
macro avg | 0.85 | 0.81 | 0.81 | 1898 |
weighted avg | 0.85 | 0.82 | 0.81 | 1898 |
- on BCrystal test:
| | precision | recall | f1-score | support |
---|---|---|---|---|
non-crystallizable | 0.73 | 0.97 | 0.83 | 896 |
crystallizable | 0.95 | 0.64 | 0.77 | 891 |
accuracy | | | 0.81 | 1787 |
macro avg | 0.84 | 0.80 | 0.80 | 1787 |
weighted avg | 0.84 | 0.81 | 0.80 | 1787 |
- on SP test:
| | precision | recall | f1-score | support |
---|---|---|---|---|
non-crystallizable | 0.62 | 0.94 | 0.75 | 89 |
crystallizable | 0.95 | 0.66 | 0.78 | 148 |
accuracy | | | 0.76 | 237 |
macro avg | 0.79 | 0.80 | 0.76 | 237 |
weighted avg | 0.83 | 0.76 | 0.77 | 237 |
- on TR test:
| | precision | recall | f1-score | support |
---|---|---|---|---|
non-crystallizable | 0.81 | 0.98 | 0.88 | 638 |
crystallizable | 0.94 | 0.60 | 0.73 | 374 |
accuracy | | | 0.84 | 1012 |
macro avg | 0.87 | 0.79 | 0.81 | 1012 |
weighted avg | 0.86 | 0.84 | 0.83 | 1012 |
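The "macro avg" and "weighted avg" rows differ only in how the two per-class values are combined: a plain mean versus a mean weighted by support. A small sketch for the DeepCrystal test recall follows, with the per-class correct counts (970 and 579) taken from the diagonal of the confusion matrix; the helper names are illustrative.

```python
def macro_avg(values):
    """Unweighted mean of per-class scores."""
    return sum(values) / len(values)


def weighted_avg(values, supports):
    """Mean of per-class scores weighted by the number of samples per class."""
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)


# DeepCrystal test, class order: [non-crystallizable, crystallizable].
supports = [1000, 898]
recalls = [970 / 1000, 579 / 898]  # correct predictions / class support

macro_recall = macro_avg(recalls)
weighted_recall = weighted_avg(recalls, supports)
print(round(macro_recall, 2), round(weighted_recall, 2))
```

The support-weighted recall is the same quantity as overall accuracy, which is why those two cells agree in each report.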
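Confusion matrix (rows: actual class; columns: predicted class):
- on DeepCrystal test:
| | predicted crystallizable | predicted non-crystallizable |
|---|---|---|
| actual crystallizable | 579 | 319 |
| actual non-crystallizable | 30 | 970 |
- on BCrystal test:
| | predicted crystallizable | predicted non-crystallizable |
|---|---|---|
| actual crystallizable | 573 | 318 |
| actual non-crystallizable | 30 | 866 |
- on SP test:
| | predicted crystallizable | predicted non-crystallizable |
|---|---|---|
| actual crystallizable | 97 | 51 |
| actual non-crystallizable | 5 | 84 |
- on TR test:
| | predicted crystallizable | predicted non-crystallizable |
|---|---|---|
| actual crystallizable | 225 | 149 |
| actual non-crystallizable | 14 | 624 |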
Metrics
ROC AUC score:
on DeepCrystal test: 0.9403474387527841
on BCrystal test: 0.9395705567580568
on SP test: 0.9293197692074097
on TR test: 0.9561924798417515
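For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen crystallizable sequence receives a higher score than a randomly chosen non-crystallizable one. The sketch below implements that pairwise definition directly on a tiny, made-up set of scores; the data are hypothetical, and this is not necessarily how the figures above were produced.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = crystallizable) and predicted probabilities.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 4))
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; library implementations typically use a sort-based approach instead.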
Matthews correlation coefficient (MCC):
on DeepCrystal test: 0.6575261170551334
on BCrystal test: 0.6446356961702661
on SP test: 0.586069703866632
on TR test: 0.6587661924247377
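These values follow from the standard Matthews correlation formula applied to the confusion counts. A minimal sketch, using the DeepCrystal and TR test counts from this card (the function name is illustrative):

```python
import math


def matthews_corrcoef_from_counts(tp, fp, fn, tn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


mcc_deepcrystal = matthews_corrcoef_from_counts(tp=579, fp=30, fn=319, tn=970)
mcc_tr = matthews_corrcoef_from_counts(tp=225, fp=14, fn=149, tn=624)
print(round(mcc_deepcrystal, 4), round(mcc_tr, 4))
```

The formula is symmetric in FP and FN, so the result does not depend on which of the two off-diagonal counts is treated as false positives.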
Specificity (true negative rate):
on DeepCrystal test: 0.97
on BCrystal test: 0.9665178571428571
on SP test: 0.9438202247191011
on TR test: 0.9780564263322884
Sensitivity (recall, true positive rate):
on DeepCrystal test: 0.6447661469933185
on BCrystal test: 0.6430976430976431
on SP test: 0.6554054054054054
on TR test: 0.6016042780748663
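Sensitivity, specificity, PPV and NPV are easy to confuse, because each divides a diagonal count of the confusion matrix by a different total. The sketch below computes all four for the DeepCrystal test, using counts implied by the per-class report (579 of 898 crystallizable and 970 of 1000 non-crystallizable sequences predicted correctly); the function name is illustrative.

```python
def confusion_rates(tp, fp, fn, tn):
    """Row-wise rates (per actual class) and column-wise rates (per predicted class)."""
    return {
        "sensitivity": tp / (tp + fn),  # share of actual positives found
        "specificity": tn / (tn + fp),  # share of actual negatives kept
        "ppv": tp / (tp + fp),          # share of positive predictions that are right
        "npv": tn / (tn + fn),          # share of negative predictions that are right
    }


rates = confusion_rates(tp=579, fp=30, fn=319, tn=970)
for name, value in rates.items():
    print(f"{name}: {value:.4f}")
```

Here the values dividing by the actual-class totals (898 and 1000) are about 0.64 and 0.97, while those dividing by the predicted-class totals (609 and 1289) are about 0.95 and 0.75, the latter pair matching the per-class precision column of the DeepCrystal report.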
Researchers:
Credits: