AmelieSchreiber
committed on
Commit 07c5e2c
1 Parent(s): 3de8059
Update README.md
README.md CHANGED
@@ -8,6 +8,34 @@ This is the 650M parameter version of ESM-2, finetuned with QLoRA to predict bin
No multiple sequence alignment or structure is required. The embeddings from this model can also be used in structural models. The model is trained on
approximately 12M protein sequences from UniProt, with an 80/20 train/test split.

## Metrics
Preliminary tests on 20% samples of the train and test splits gave the following metrics:

### Train Metrics
```python
'eval_loss': 0.05597764626145363,
'eval_accuracy': 0.9829392036087405,
'eval_precision': 0.5626191259397847,
'eval_recall': 0.9488112528941492,
'eval_f1': 0.7063763773187873,
'eval_auc': 0.9662524626230765,
'eval_mcc': 0.7235838533979579
```
### Test Metrics
```python
'eval_loss': 0.16281947493553162,
'eval_accuracy': 0.9569658774883986,
'eval_precision': 0.3209956738348438,
'eval_recall': 0.7883697002335764,
'eval_f1': 0.4562306866120791,
'eval_auc': 0.8746433990040084,
'eval_mcc': 0.48648765699020435
```
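As a quick sanity check on the reported numbers, F1 is the harmonic mean of precision and recall, so it can be recomputed from the values above. The sketch below is not part of the original evaluation code; the helper name `f1_from_precision_recall` and the `reported` dictionary are illustrative, and the values are copied from the two metric blocks.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the Train Metrics and Test Metrics blocks.
reported = {
    "train": {
        "precision": 0.5626191259397847,
        "recall": 0.9488112528941492,
        "f1": 0.7063763773187873,
    },
    "test": {
        "precision": 0.3209956738348438,
        "recall": 0.7883697002335764,
        "f1": 0.4562306866120791,
    },
}

for split, m in reported.items():
    recomputed = f1_from_precision_recall(m["precision"], m["recall"])
    print(f"{split}: recomputed F1 = {recomputed:.4f}, reported F1 = {m['f1']:.4f}")
```

In both splits recall is much higher than precision while accuracy stays above 0.95. That pattern usually points to a rare positive class, so F1 and MCC are likely more informative here than accuracy.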
## Using the Model
```python