---
license: mit
---
|
|
|
# ESM-2 Full Finetune for Binding Sites
|
|
|
This model is a full finetune of ESM-2, intended to illustrate how full finetuning overfits and generalizes
poorly compared to LoRA and QLoRA finetuning. The model was finetuned on the 600K dataset. Note also that on a
24GB A10 GPU, the batch size must be significantly smaller than when using LoRA or QLoRA. To finetune a
similar model, use
[this script](https://huggingface.co/AmelieSchreiber/esm2_t6_8M_binding_sites_finetune/blob/main/finetune.py).
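
For reference, below is a minimal inference sketch. It assumes this repository hosts the finetuned
token-classification weights and uses a binary label scheme in which label 1 marks a predicted binding
residue; adjust the model ID and label interpretation to your setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Assumed model ID for this repository's finetuned checkpoint.
model_id = "AmelieSchreiber/esm2_t6_8M_binding_sites_finetune"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)
model.eval()

# Example protein sequence (single-letter amino acid codes).
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
inputs = tokenizer(sequence, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# Per-token predictions; under the assumed label map, 1 = binding residue.
predictions = logits.argmax(dim=-1).squeeze(0).tolist()
print(predictions)
```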
|
|
|
## Overfitting
|
|
|
```python
# Train metrics:
{'eval_loss': 0.13651661574840546,
 'eval_accuracy': 0.9656322509450104,
 'eval_precision': 0.38616650354104665,
 'eval_recall': 0.9618091516702236,
 'eval_f1': 0.55107594226701,
 'eval_auc': 0.9637635647574605,
 'eval_mcc': 0.5977943918337999}

# Test metrics:
{'eval_loss': 0.2910114824771881,
 'eval_accuracy': 0.923270649115702,
 'eval_precision': 0.14887069127765168,
 'eval_recall': 0.533511928419524,
 'eval_f1': 0.23278520670392827,
 'eval_auc': 0.7327381144575454,
 'eval_mcc': 0.25329082069818704}
```
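
The gap between train and test performance (F1 drops from roughly 0.55 to 0.23, and MCC from 0.60 to 0.25)
shows how poorly the full finetune generalizes. For reference, here is a sketch of a `compute_metrics`
function that produces the metric names reported above; this is an assumed reconstruction for a Hugging Face
`Trainer` token-classification setup, not necessarily the exact function used in the training script.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    matthews_corrcoef,
    precision_recall_fscore_support,
    roc_auc_score,
)

def compute_metrics(eval_pred):
    # eval_pred is (logits, labels) as numpy arrays from the Trainer.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    # Keep only real residues; special/padding tokens are labeled -100.
    mask = labels != -100
    y_true, y_pred = labels[mask], predictions[mask]

    # Probability of the positive (binding) class for AUC, via a stable softmax.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    y_score = probs[..., 1][mask]

    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="binary"
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": roc_auc_score(y_true, y_score),
        "mcc": matthews_corrcoef(y_true, y_pred),
    }
```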