---
license: mit
---
|
|
|
# ESM-2 Full Finetune for Binding Sites
|
|
|
This model is a full finetune of ESM-2, intended to illustrate how full finetuning overfits and generalizes
poorly compared to LoRA and QLoRA finetuning. The model was finetuned on the 600K dataset. Note also that on a
24GB A10 GPU, the batch size must be significantly smaller than when using LoRA or QLoRA. To finetune a
similar model, use
[this script](https://huggingface.co/AmelieSchreiber/esm2_t6_8M_binding_sites_finetune/blob/main/finetune.py).
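
For reference, below is a minimal inference sketch. It assumes this repository hosts the finetuned
token-classification weights and uses a binary label scheme in which label 1 marks a predicted binding
residue; adjust the model ID and label interpretation to your setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Assumed model ID for this repository's finetuned checkpoint.
model_id = "AmelieSchreiber/esm2_t6_8M_binding_sites_finetune"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)
model.eval()

# Example protein sequence (single-letter amino acid codes).
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
inputs = tokenizer(sequence, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# Per-token predictions; under the assumed label map, 1 = binding residue.
predictions = logits.argmax(dim=-1).squeeze(0).tolist()
print(predictions)
```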
|
|
|
## Overfitting
|
|
|
```python
# Train metrics:
{'eval_loss': 0.13651661574840546,
 'eval_accuracy': 0.9656322509450104,
 'eval_precision': 0.38616650354104665,
 'eval_recall': 0.9618091516702236,
 'eval_f1': 0.55107594226701,
 'eval_auc': 0.9637635647574605,
 'eval_mcc': 0.5977943918337999}

# Test metrics:
{'eval_loss': 0.2910114824771881,
 'eval_accuracy': 0.923270649115702,
 'eval_precision': 0.14887069127765168,
 'eval_recall': 0.533511928419524,
 'eval_f1': 0.23278520670392827,
 'eval_auc': 0.7327381144575454,
 'eval_mcc': 0.25329082069818704}
```
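
The gap between train and test performance (F1 drops from roughly 0.55 to 0.23, and MCC from 0.60 to 0.25)
shows how poorly the full finetune generalizes. For reference, here is a sketch of a `compute_metrics`
function that produces the metric names reported above; this is an assumed reconstruction for a Hugging Face
`Trainer` token-classification setup, not necessarily the exact function used in the training script.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    matthews_corrcoef,
    precision_recall_fscore_support,
    roc_auc_score,
)

def compute_metrics(eval_pred):
    # eval_pred is (logits, labels) as numpy arrays from the Trainer.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    # Keep only real residues; special/padding tokens are labeled -100.
    mask = labels != -100
    y_true, y_pred = labels[mask], predictions[mask]

    # Probability of the positive (binding) class for AUC, via a stable softmax.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    y_score = probs[..., 1][mask]

    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="binary"
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": roc_auc_score(y_true, y_score),
        "mcc": matthews_corrcoef(y_true, y_pred),
    }
```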