pretrain eval
README.md
CHANGED
@@ -44,15 +44,17 @@ This model **isn't** designed for immediate use but rather for Continued Pretrai
 
 The objective is to streamline the cognitive or reasoning core, eliminating any redundant knowledge from the model.
 
-[loss, val_loss]()
+[loss, val_loss](https://api.wandb.ai/links/mtasic85/strnx9rl)
 
-[val_ppl]()
+[val_ppl](https://api.wandb.ai/links/mtasic85/ljwxf4am)
 
-[epoch]()
+[epoch](https://api.wandb.ai/links/mtasic85/edyph869)
 
-[learning_rate]()
+[learning_rate](https://api.wandb.ai/links/mtasic85/eswxyger)
 
-##
+## Pretrain Evaluation
+
+### lm-evaluation-harness
 
 ```bash
 litgpt evaluate --tasks 'hellaswag,gsm8k,truthfulqa_mc2,mmlu,winogrande,arc_challenge' --out_dir 'evaluate-quick/' --batch_size 4 --dtype 'bfloat16' out/pretrain/final/
@@ -256,10 +258,6 @@ litgpt evaluate --tasks 'arc_challenge,boolq,gpqa,hellaswag,openbookqa,piqa,trut
 |truthfulqa_mc2 | 2|none | 0|acc |↑ |0.5061|± |0.0167|
 |winogrande | 1|none | 0|acc |↑ |0.4933|± |0.0141|
 
-```bash
-litgpt evaluate --tasks 'mmlu_multilingual,mgsm' --out_dir 'evaluate-multilinguals/' --batch_size 4 --dtype 'bfloat16' out/pretrain/final/
-```
-
 ```bash
 litgpt evaluate --tasks 'wikitext,qasper' --out_dir 'evaluate-long/' --batch_size 4 --dtype 'bfloat16' out/pretrain/final/
 ```
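
The result rows kept as context in the second hunk appear to follow the lm-evaluation-harness table layout. As a minimal sketch, assuming the nine-column layout visible in those rows (task, version, filter, n-shot, metric, direction, value, `±`, stderr), the rows could be parsed into records so that scores can be compared across runs. The function name `parse_results_rows`, the constant `SAMPLE_ROWS`, and the dictionary keys are hypothetical and are not part of litgpt or lm-evaluation-harness.

```python
from typing import Dict, List, Union

# Two rows copied from the README table shown in the diff above.
SAMPLE_ROWS = """\
|truthfulqa_mc2 | 2|none | 0|acc |↑ |0.5061|± |0.0167|
|winogrande | 1|none | 0|acc |↑ |0.4933|± |0.0141|
"""


def parse_results_rows(table_text: str) -> List[Dict[str, Union[str, float]]]:
    """Parse pipe-delimited result rows into dictionaries.

    Assumes nine cells per row; header rows, separator rows, and rows
    whose value or stderr is not numeric are skipped.
    """
    rows: List[Dict[str, Union[str, float]]] = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        # Drop the outer pipes, then split the remaining cells.
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 9:
            continue
        try:
            value = float(cells[6])
            stderr = float(cells[8])
        except ValueError:
            continue  # header, separator, or non-numeric row
        rows.append(
            {
                "task": cells[0],
                "version": cells[1],
                "filter": cells[2],
                "n_shot": cells[3],
                "metric": cells[4],
                "value": value,
                "stderr": stderr,
            }
        )
    return rows


if __name__ == "__main__":
    for row in parse_results_rows(SAMPLE_ROWS):
        print(f"{row['task']}: {row['value']:.4f} ± {row['stderr']:.4f}")
```

Keeping the parser tolerant of header and separator rows means a whole table can be pasted in unchanged; rows that report a non-numeric stderr are dropped rather than guessed at.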