AmelieSchreiber committed
Commit 16bb25e
1 Parent(s): b79663e
Update README.md
README.md CHANGED
@@ -51,7 +51,8 @@ trainable params: 23682 || all params: 4075265 || trainable%: 0.5811155838945443
It was shown in the QLoRA paper that to obtain performance comparable to or better than full finetuning, the most important hyperparameter
that can be adjusted is which weight matrices the LoRA adapters are applied to, with more being better. The rank and other hyperparameters,
such as the scaling factor alpha, did not seem to matter. So, an important thing to investigate next would be to check whether this
-transfers to protein language models as well.
+transfers to protein language models as well. A general pattern is emerging in which overfitting is reduced by adding adapters for more of the
+weight matrices, so more adapter layers seem to be better in that regard as well.

## Testing for Overfitting

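To make the "adapters on more weight matrices" point concrete, here is a minimal sketch of what targeting additional matrices with LoRA could look like for an ESM-2 model using Hugging Face `peft`. The checkpoint name, `target_modules` list, and hyperparameter values are illustrative assumptions, not the configuration used in this repo.

```python
from transformers import AutoModelForTokenClassification
from peft import LoraConfig, TaskType, get_peft_model

# Assumed small ESM-2 checkpoint for illustration; substitute the model actually being finetuned.
base_model = AutoModelForTokenClassification.from_pretrained(
    "facebook/esm2_t6_8M_UR50D",
    num_labels=2,
)

lora_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,
    r=8,            # rank; per the QLoRA findings, less important than adapter coverage
    lora_alpha=16,  # scaling factor alpha; likewise reported as not critical
    lora_dropout=0.05,
    bias="none",
    # Listing more of the model's weight matrices here attaches more LoRA adapters,
    # which is the "more is better" knob discussed above. These module names follow
    # ESM-2's attention/projection layers in transformers and are assumptions.
    target_modules=["query", "key", "value", "dense"],
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # prints a "trainable params: ... || all params: ..." line
```

Comparing a narrow configuration such as `target_modules=["query", "value"]` against a broader one like the list above, on both training and validation metrics, is the kind of experiment the paragraph above suggests running to see whether the QLoRA finding transfers to protein language models.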