AmelieSchreiber committed
Commit 16bb25e
1 Parent(s): b79663e
Update README.md
README.md CHANGED
@@ -51,7 +51,8 @@ trainable params: 23682 || all params: 4075265 || trainable%: 0.5811155838945443
It was shown in the QLoRA paper that to obtain performance comparable to or better than full finetuning, the most important hyperparameter
that can be adjusted is which weight matrices the LoRA adapters are applied to, with more being better. The rank and other hyperparameters,
such as the scaling factor alpha, did not seem to matter. So, an important thing to investigate next would be to check whether this
-transfers to protein language models as well.
+transfers to protein language models as well. A general pattern is emerging in which overfitting is reduced by adding adapters for more of the
+weight matrices, so more adapter layers seem to be better in that regard as well.

## Testing for Overfitting

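To make the "adapters on more weight matrices" point concrete, here is a minimal sketch of what targeting additional matrices with LoRA could look like for an ESM-2 model using Hugging Face `peft`. The checkpoint name, `target_modules` list, and hyperparameter values are illustrative assumptions, not the configuration used in this repo.

```python
from transformers import AutoModelForTokenClassification
from peft import LoraConfig, TaskType, get_peft_model

# Assumed small ESM-2 checkpoint for illustration; substitute the model actually being finetuned.
base_model = AutoModelForTokenClassification.from_pretrained(
    "facebook/esm2_t6_8M_UR50D",
    num_labels=2,
)

lora_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,
    r=8,            # rank; per the QLoRA findings, less important than adapter coverage
    lora_alpha=16,  # scaling factor alpha; likewise reported as not critical
    lora_dropout=0.05,
    bias="none",
    # Listing more of the model's weight matrices here attaches more LoRA adapters,
    # which is the "more is better" knob discussed above. These module names follow
    # ESM-2's attention/projection layers in transformers and are assumptions.
    target_modules=["query", "key", "value", "dense"],
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # prints a "trainable params: ... || all params: ..." line
```

Comparing a narrow configuration such as `target_modules=["query", "value"]` against a broader one like the list above, on both training and validation metrics, is the kind of experiment the paragraph above suggests running to see whether the QLoRA finding transfers to protein language models.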