Pclanglais committed
Commit d007f04
Parent(s): 023caa5
Update README.md
Since I got the official confirmation from Jean Zay :)
README.md CHANGED
@@ -21,7 +21,9 @@ OCRonos-Vintage is only 124 million parameters. It can run easily on CPU or prov
## Training

OCRonos-Vintage was pre-trained from scratch on a dataset of cultural heritage archives from the Library of Congress, Internet Archive and Hathi Trust totalling 18 billion tokens.

Pre-training ran for 2 epochs with llm.c (9,060 steps total) on 4 H100s for two and a half hours.
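As a rough sanity check of these figures (assuming the 18-billion-token dataset, 2 epochs, 9,060 steps, and the 1,024-token context window quoted here; the per-GPU split is an illustrative assumption, ignoring gradient accumulation), the step count implies a batch of roughly four million tokens per optimizer step:

```python
# Rough sanity check of the reported training budget.
# Figures from the model card: 18B-token dataset, 2 epochs,
# 9,060 optimizer steps, 1,024-token context window, 4 H100s.
dataset_tokens = 18_000_000_000
epochs = 2
steps = 9_060
context = 1_024
gpus = 4

tokens_per_step = dataset_tokens * epochs / steps      # ~4M tokens/step
sequences_per_step = tokens_per_step / context         # ~3,900 sequences/step
sequences_per_gpu = sequences_per_step / gpus          # split across 4 GPUs

print(f"{tokens_per_step:,.0f} tokens per step")
print(f"{sequences_per_step:,.0f} sequences per step")
print(f"{sequences_per_gpu:,.0f} sequences per GPU per step")
```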
OCRonos-Vintage is the first model trained on the new Jean Zay H100 cluster (compute grant n°GC011015451). We used the following command for training, with mostly default hyperparameters, including a short context window of 1,024 tokens.

```bash
srun --ntasks-per-node=4 --gres=gpu:4 ./train_gpt2cu \