venkatasg/lil-bevo-x

venkatasg committed on Aug 25, 2023

Commit b2361ea • 1 parent: 750db0c

Update README.md

Files changed (1)

README.md +5 -6

@@ -12,12 +12,11 @@ Lil-Bevo-X is UT Austin's submission to the BabyLM challenge, specifically the *
 [Link to GitHub Repo](https://github.com/venkatasg/Lil-Bevo)
-## TLDR:
-- Unigram tokenizer trained on 10M BabyLM tokens plus MAESTRO dataset for a vocab size of 32k.
-- `deberta-base-v3` trained on mixture of MAESTRO and 100M tokens for 3 epochs.
-- Model continues training for 100,000 steps with 128 sequence length.
-- Model continues training for 65,000 steps with 512 sequence length.
-- Model is trained with targeted linguistic masking for 1 epoch.
+## Model training regime:
+1. 5 epochs on MAESTRO dataset (85M non-language music tokens) combined with strict small dataset.
+2. 50 epochs of pretraining with sequence length of 128 on strict dataset.
+3. 150 epochs of pretraining with sequence length of 512 on strict dataset.
+4. 10 epochs of targeted MLM.
   This README will be updated with more details soon.
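
The four-stage regime added in this commit can be written down as data, which makes the schedule easier to check at a glance. The following is a minimal sketch, not the authors' training code: the `Stage` class, its field names, and the stage labels are hypothetical, and only the epoch counts, sequence lengths, and dataset descriptions come from the updated README. Where the README does not state a value, the sketch records that rather than guessing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    """One stage of the regime (hypothetical container, not from the repo)."""
    name: str                # illustrative label, not an identifier from the repo
    epochs: int              # epoch count as listed in the README
    seq_len: Optional[int]   # None where the README gives no sequence length
    data: str                # dataset description as given in the README


# Values copied from the "Model training regime" list in the updated README.
REGIME = [
    Stage("music_plus_text", 5, None, "MAESTRO (85M music tokens) + strict small"),
    Stage("short_context", 50, 128, "strict"),
    Stage("long_context", 150, 512, "strict"),
    Stage("targeted_mlm", 10, None, "not stated"),
]


def total_epochs(stages):
    """Sum the epoch counts across all stages."""
    return sum(stage.epochs for stage in stages)


def describe(stages):
    """Return one human-readable line per stage, in order."""
    lines = []
    for index, stage in enumerate(stages, start=1):
        if stage.seq_len is None:
            length = "seq_len not stated"
        else:
            length = f"seq_len={stage.seq_len}"
        lines.append(
            f"{index}. {stage.name}: {stage.epochs} epochs, {length}, data: {stage.data}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(describe(REGIME)))
    print("total epochs:", total_epochs(REGIME))
```

Keeping the unstated fields as `None` or "not stated" leaves room to fill them in once the README is updated with more details.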