mtasic85 committed
Commit
dc0b6f5
1 Parent(s): 05a4d11
README.md CHANGED
@@ -26,6 +26,12 @@ tags:
26
 
27
  ![logo](./misc/logo.png)
28
 
29
  [loss, val_loss](https://api.wandb.ai/links/mtasic85/m1gxynpw)
30
 
31
  [val_ppl](https://api.wandb.ai/links/mtasic85/xs21d6u2)
@@ -36,83 +42,142 @@ tags:
36
 
37
  ## lm-evaluation-harness
38
 
39
- | Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
40
- |---------------------------------------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
41
- |arc_challenge | 1|none | 0|acc |↑ |0.1937|± |0.0115|
42
- | | |none | 0|acc_norm |↑ |0.2363|± |0.0124|
43
- |gsm8k | 3|flexible-extract| 5|exact_match|↑ |0.0136|± |0.0032|
44
- | | |strict-match | 5|exact_match|↑ |0.0000|± |0.0000|
45
- |hellaswag | 1|none | 0|acc |↑ |0.2659|± |0.0044|
46
- | | |none | 0|acc_norm |↑ |0.2709|± |0.0044|
47
- |mmlu | 2|none | |acc |↑ |0.2309|± |0.0036|
48
- | - humanities | 2|none | |acc |↑ |0.2370|± |0.0062|
49
- | - formal_logic | 1|none | 0|acc |↑ |0.2778|± |0.0401|
50
- | - high_school_european_history | 1|none | 0|acc |↑ |0.2303|± |0.0329|
51
- | - high_school_us_history | 1|none | 0|acc |↑ |0.2402|± |0.0300|
52
- | - high_school_world_history | 1|none | 0|acc |↑ |0.2405|± |0.0278|
53
  | - international_law | 1|none | 0|acc |↑ |0.1983|± |0.0364|
54
  | - jurisprudence | 1|none | 0|acc |↑ |0.2315|± |0.0408|
55
  | - logical_fallacies | 1|none | 0|acc |↑ |0.1840|± |0.0304|
56
  | - moral_disputes | 1|none | 0|acc |↑ |0.2110|± |0.0220|
57
  | - moral_scenarios | 1|none | 0|acc |↑ |0.2380|± |0.0142|
58
- | - philosophy | 1|none | 0|acc |↑ |0.1994|± |0.0227|
59
  | - prehistory | 1|none | 0|acc |↑ |0.2315|± |0.0235|
60
- | - professional_law | 1|none | 0|acc |↑ |0.2510|± |0.0111|
61
  | - world_religions | 1|none | 0|acc |↑ |0.2865|± |0.0347|
62
- | - other | 2|none | |acc |↑ |0.2372|± |0.0076|
63
  | - business_ethics | 1|none | 0|acc |↑ |0.2900|± |0.0456|
64
  | - clinical_knowledge | 1|none | 0|acc |↑ |0.2113|± |0.0251|
65
- | - college_medicine | 1|none | 0|acc |↑ |0.2023|± |0.0306|
66
  | - global_facts | 1|none | 0|acc |↑ |0.1900|± |0.0394|
67
  | - human_aging | 1|none | 0|acc |↑ |0.3004|± |0.0308|
68
  | - management | 1|none | 0|acc |↑ |0.1748|± |0.0376|
69
  | - marketing | 1|none | 0|acc |↑ |0.2863|± |0.0296|
70
- | - medical_genetics | 1|none | 0|acc |↑ |0.2700|± |0.0446|
71
- | - miscellaneous | 1|none | 0|acc |↑ |0.2337|± |0.0151|
72
  | - nutrition | 1|none | 0|acc |↑ |0.2255|± |0.0239|
73
- | - professional_accounting | 1|none | 0|acc |↑ |0.2411|± |0.0255|
74
  | - professional_medicine | 1|none | 0|acc |↑ |0.1985|± |0.0242|
75
- | - virology | 1|none | 0|acc |↑ |0.2711|± |0.0346|
76
- | - social sciences | 2|none | |acc |↑ |0.2278|± |0.0076|
77
  | - econometrics | 1|none | 0|acc |↑ |0.2105|± |0.0384|
78
- | - high_school_geography | 1|none | 0|acc |↑ |0.1768|± |0.0272|
79
  | - high_school_government_and_politics| 1|none | 0|acc |↑ |0.2280|± |0.0303|
80
- | - high_school_macroeconomics | 1|none | 0|acc |↑ |0.2436|± |0.0218|
81
- | - high_school_microeconomics | 1|none | 0|acc |↑ |0.2395|± |0.0277|
82
- | - high_school_psychology | 1|none | 0|acc |↑ |0.2037|± |0.0173|
83
  | - human_sexuality | 1|none | 0|acc |↑ |0.2595|± |0.0384|
84
- | - professional_psychology | 1|none | 0|acc |↑ |0.2386|± |0.0172|
85
  | - public_relations | 1|none | 0|acc |↑ |0.2091|± |0.0390|
86
- | - security_studies | 1|none | 0|acc |↑ |0.2490|± |0.0277|
87
- | - sociology | 1|none | 0|acc |↑ |0.1990|± |0.0282|
88
  | - us_foreign_policy | 1|none | 0|acc |↑ |0.3100|± |0.0465|
89
- | - stem | 2|none | |acc |↑ |0.2185|± |0.0074|
90
- | - abstract_algebra | 1|none | 0|acc |↑ |0.2600|± |0.0441|
91
  | - anatomy | 1|none | 0|acc |↑ |0.1630|± |0.0319|
92
- | - astronomy | 1|none | 0|acc |↑ |0.2237|± |0.0339|
93
- | - college_biology | 1|none | 0|acc |↑ |0.2708|± |0.0372|
94
- | - college_chemistry | 1|none | 0|acc |↑ |0.2300|± |0.0423|
95
  | - college_computer_science | 1|none | 0|acc |↑ |0.2100|± |0.0409|
96
- | - college_mathematics | 1|none | 0|acc |↑ |0.2200|± |0.0416|
97
- | - college_physics | 1|none | 0|acc |↑ |0.2647|± |0.0439|
98
  | - computer_security | 1|none | 0|acc |↑ |0.3000|± |0.0461|
99
- | - conceptual_physics | 1|none | 0|acc |↑ |0.2000|± |0.0261|
100
- | - electrical_engineering | 1|none | 0|acc |↑ |0.2345|± |0.0353|
101
  | - elementary_mathematics | 1|none | 0|acc |↑ |0.2302|± |0.0217|
102
- | - high_school_biology | 1|none | 0|acc |↑ |0.1903|± |0.0223|
103
  | - high_school_chemistry | 1|none | 0|acc |↑ |0.1527|± |0.0253|
104
- | - high_school_computer_science | 1|none | 0|acc |↑ |0.2700|± |0.0446|
105
- | - high_school_mathematics | 1|none | 0|acc |↑ |0.1926|± |0.0240|
106
  | - high_school_physics | 1|none | 0|acc |↑ |0.2053|± |0.0330|
107
- | - high_school_statistics | 1|none | 0|acc |↑ |0.2130|± |0.0279|
108
  | - machine_learning | 1|none | 0|acc |↑ |0.2768|± |0.0425|
109
- |truthfulqa_mc2 | 2|none | 0|acc |↑ |0.4683|± |0.0160|
110
- |winogrande | 1|none | 0|acc |↑ |0.5075|± |0.0141|
111
 
112
  | Groups |Version|Filter|n-shot|Metric| |Value | |Stderr|
113
  |------------------|------:|------|------|------|---|-----:|---|-----:|
114
- |mmlu | 2|none | |acc |↑ |0.2309|± |0.0036|
115
- | - humanities | 2|none | |acc |↑ |0.2370|± |0.0062|
116
- | - other | 2|none | |acc |↑ |0.2372|± |0.0076|
117
- | - social sciences| 2|none | |acc |↑ |0.2278|± |0.0076|
118
- | - stem | 2|none | |acc |↑ |0.2185|± |0.0074|
 
26
 
27
  ![logo](./misc/logo.png)
28
 
29
+ A pretrained language model based on the Llama architecture, with about **108M** parameters. It was trained on **9.7B** (`9,782,206,713`) tokens drawn from more than **5.2M** (`5,285,575`) dataset rows.
30
+
31
+ This model **isn't** intended for immediate use; it is meant for continued pretraining and finetuning on a downstream task. While it can handle a context length of up to **32K** (`32,768`) tokens, it was pretrained on sequences of **2K** (`2,048`) tokens.
32
+
33
+ The objective is to streamline the model down to its cognitive and reasoning core, eliminating redundant memorized knowledge.
34
+
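+ As a quick smoke test, the checkpoint can be loaded through litgpt's Python API. A minimal sketch, assuming a litgpt-format checkpoint at the `out/pretrain/final/` path used by the evaluation commands below:
+
+ ```python
+ # Minimal load-and-generate sketch; assumes litgpt's Python API and the
+ # pretraining output directory referenced by the evaluation commands below.
+ from litgpt import LLM
+
+ llm = LLM.load("out/pretrain/final/")
+
+ # Base (non-instruct) model: expect a raw continuation, not an answer.
+ print(llm.generate("Water boils at", max_new_tokens=32))
+ ```
+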
35
  [loss, val_loss](https://api.wandb.ai/links/mtasic85/m1gxynpw)
36
 
37
  [val_ppl](https://api.wandb.ai/links/mtasic85/xs21d6u2)
 
42
 
43
  ## lm-evaluation-harness
44
 
45
+ ```bash
46
+ litgpt evaluate --tasks 'leaderboard' --out_dir 'evaluate-0/' --batch_size 4 --dtype 'bfloat16' out/pretrain/final/
47
+ ```
48
+
49
+ | Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
50
+ |-----------------------------------------------------------|-------|------|-----:|-----------------------|---|-----:|---|------|
51
+ |leaderboard | N/A| | | | | | | |
52
+ | - leaderboard_bbh | N/A| | | | | | | |
53
+ | - leaderboard_bbh_boolean_expressions | 1|none | 3|acc_norm |↑ |0.5040|± |0.0317|
54
+ | - leaderboard_bbh_causal_judgement | 1|none | 3|acc_norm |↑ |0.5134|± |0.0366|
55
+ | - leaderboard_bbh_date_understanding | 1|none | 3|acc_norm |↑ |0.1880|± |0.0248|
56
+ | - leaderboard_bbh_disambiguation_qa | 1|none | 3|acc_norm |↑ |0.3000|± |0.0290|
57
+ | - leaderboard_bbh_formal_fallacies | 1|none | 3|acc_norm |↑ |0.4680|± |0.0316|
58
+ | - leaderboard_bbh_geometric_shapes | 1|none | 3|acc_norm |↑ |0.0560|± |0.0146|
59
+ | - leaderboard_bbh_hyperbaton | 1|none | 3|acc_norm |↑ |0.4960|± |0.0317|
60
+ | - leaderboard_bbh_logical_deduction_five_objects | 1|none | 3|acc_norm |↑ |0.1680|± |0.0237|
61
+ | - leaderboard_bbh_logical_deduction_seven_objects | 1|none | 3|acc_norm |↑ |0.1400|± |0.0220|
62
+ | - leaderboard_bbh_logical_deduction_three_objects | 1|none | 3|acc_norm |↑ |0.3360|± |0.0299|
63
+ | - leaderboard_bbh_movie_recommendation | 1|none | 3|acc_norm |↑ |0.2280|± |0.0266|
64
+ | - leaderboard_bbh_navigate | 1|none | 3|acc_norm |↑ |0.4720|± |0.0316|
65
+ | - leaderboard_bbh_object_counting | 1|none | 3|acc_norm |↑ |0.1000|± |0.0190|
66
+ | - leaderboard_bbh_penguins_in_a_table | 1|none | 3|acc_norm |↑ |0.1918|± |0.0327|
67
+ | - leaderboard_bbh_reasoning_about_colored_objects | 1|none | 3|acc_norm |↑ |0.1480|± |0.0225|
68
+ | - leaderboard_bbh_ruin_names | 1|none | 3|acc_norm |↑ |0.2920|± |0.0288|
69
+ | - leaderboard_bbh_salient_translation_error_detection | 1|none | 3|acc_norm |↑ |0.1520|± |0.0228|
70
+ | - leaderboard_bbh_snarks | 1|none | 3|acc_norm |↑ |0.4494|± |0.0374|
71
+ | - leaderboard_bbh_sports_understanding | 1|none | 3|acc_norm |↑ |0.4600|± |0.0316|
72
+ | - leaderboard_bbh_temporal_sequences | 1|none | 3|acc_norm |↑ |0.2480|± |0.0274|
73
+ | - leaderboard_bbh_tracking_shuffled_objects_five_objects | 1|none | 3|acc_norm |↑ |0.1920|± |0.0250|
74
+ | - leaderboard_bbh_tracking_shuffled_objects_seven_objects| 1|none | 3|acc_norm |↑ |0.1560|± |0.0230|
75
+ | - leaderboard_bbh_tracking_shuffled_objects_three_objects| 1|none | 3|acc_norm |↑ |0.3000|± |0.0290|
76
+ | - leaderboard_bbh_web_of_lies | 1|none | 3|acc_norm |↑ |0.5040|± |0.0317|
77
+ | - leaderboard_gpqa | N/A| | | | | | | |
78
+ | - leaderboard_gpqa_diamond | 1|none | 0|acc_norm |↑ |0.2222|± |0.0296|
79
+ | - leaderboard_gpqa_extended | 1|none | 0|acc_norm |↑ |0.2711|± |0.0190|
80
+ | - leaderboard_gpqa_main | 1|none | 0|acc_norm |↑ |0.2589|± |0.0207|
81
+ | - leaderboard_ifeval | 3|none | 0|inst_level_loose_acc |↑ |0.2050|± | N/A|
82
+ | | |none | 0|inst_level_strict_acc |↑ |0.1966|± | N/A|
83
+ | | |none | 0|prompt_level_loose_acc |↑ |0.1072|± |0.0133|
84
+ | | |none | 0|prompt_level_strict_acc|↑ |0.1035|± |0.0131|
85
+ | - leaderboard_math_hard | N/A| | | | | | | |
86
+ | - leaderboard_math_algebra_hard | 1|none | 4|exact_match |↑ |0.0000|± | 0|
87
+ | - leaderboard_math_counting_and_prob_hard | 1|none | 4|exact_match |↑ |0.0000|± | 0|
88
+ | - leaderboard_math_geometry_hard | 1|none | 4|exact_match |↑ |0.0000|± | 0|
89
+ | - leaderboard_math_intermediate_algebra_hard | 1|none | 4|exact_match |↑ |0.0000|± | 0|
90
+ | - leaderboard_math_num_theory_hard | 1|none | 4|exact_match |↑ |0.0000|± | 0|
91
+ | - leaderboard_math_prealgebra_hard | 1|none | 4|exact_match |↑ |0.0000|± | 0|
92
+ | - leaderboard_math_precalculus_hard | 1|none | 4|exact_match |↑ |0.0000|± | 0|
93
+ | - leaderboard_mmlu_pro | 0.1|none | 5|acc |↑ |0.1151|± |0.0029|
94
+ | - leaderboard_musr | N/A| | | | | | | |
95
+ | - leaderboard_musr_murder_mysteries | 1|none | 0|acc_norm |↑ |0.4840|± |0.0317|
96
+ | - leaderboard_musr_object_placements | 1|none | 0|acc_norm |↑ |0.3125|± |0.0290|
97
+ | - leaderboard_musr_team_allocation | 1|none | 0|acc_norm |↑ |0.3840|± |0.0308|
98
+
99
+
100
+ ```bash
101
+ litgpt evaluate --tasks 'hellaswag,gsm8k,truthfulqa_mc2,mmlu,winogrande,arc_challenge' --out_dir 'evaluate-1/' --batch_size 4 --dtype 'bfloat16' out/pretrain/final/
102
+ ```
103
+
104
+ | Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
105
+ |---------------------------------------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
106
+ |arc_challenge | 1|none | 0|acc |↑ |0.1911|± |0.0115|
107
+ | | |none | 0|acc_norm |↑ |0.2355|± |0.0124|
108
+ |gsm8k | 3|flexible-extract| 5|exact_match|↑ |0.0152|± |0.0034|
109
+ | | |strict-match | 5|exact_match|↑ |0.0000|± |0.0000|
110
+ |hellaswag | 1|none | 0|acc |↑ |0.2661|± |0.0044|
111
+ | | |none | 0|acc_norm |↑ |0.2708|± |0.0044|
112
+ |mmlu | 2|none | |acc |↑ |0.2315|± |0.0036|
113
+ | - humanities | 2|none | |acc |↑ |0.2372|± |0.0062|
114
+ | - formal_logic | 1|none | 0|acc |↑ |0.2937|± |0.0407|
115
+ | - high_school_european_history | 1|none | 0|acc |↑ |0.2424|± |0.0335|
116
+ | - high_school_us_history | 1|none | 0|acc |↑ |0.2451|± |0.0302|
117
+ | - high_school_world_history | 1|none | 0|acc |↑ |0.2321|± |0.0275|
118
  | - international_law | 1|none | 0|acc |↑ |0.1983|± |0.0364|
119
  | - jurisprudence | 1|none | 0|acc |↑ |0.2315|± |0.0408|
120
  | - logical_fallacies | 1|none | 0|acc |↑ |0.1840|± |0.0304|
121
  | - moral_disputes | 1|none | 0|acc |↑ |0.2110|± |0.0220|
122
  | - moral_scenarios | 1|none | 0|acc |↑ |0.2380|± |0.0142|
123
+ | - philosophy | 1|none | 0|acc |↑ |0.1961|± |0.0226|
124
  | - prehistory | 1|none | 0|acc |↑ |0.2315|± |0.0235|
125
+ | - professional_law | 1|none | 0|acc |↑ |0.2503|± |0.0111|
126
  | - world_religions | 1|none | 0|acc |↑ |0.2865|± |0.0347|
127
+ | - other | 2|none | |acc |↑ |0.2385|± |0.0076|
128
  | - business_ethics | 1|none | 0|acc |↑ |0.2900|± |0.0456|
129
  | - clinical_knowledge | 1|none | 0|acc |↑ |0.2113|± |0.0251|
130
+ | - college_medicine | 1|none | 0|acc |↑ |0.1965|± |0.0303|
131
  | - global_facts | 1|none | 0|acc |↑ |0.1900|± |0.0394|
132
  | - human_aging | 1|none | 0|acc |↑ |0.3004|± |0.0308|
133
  | - management | 1|none | 0|acc |↑ |0.1748|± |0.0376|
134
  | - marketing | 1|none | 0|acc |↑ |0.2863|± |0.0296|
135
+ | - medical_genetics | 1|none | 0|acc |↑ |0.2800|± |0.0451|
136
+ | - miscellaneous | 1|none | 0|acc |↑ |0.2350|± |0.0152|
137
  | - nutrition | 1|none | 0|acc |↑ |0.2255|± |0.0239|
138
+ | - professional_accounting | 1|none | 0|acc |↑ |0.2482|± |0.0258|
139
  | - professional_medicine | 1|none | 0|acc |↑ |0.1985|± |0.0242|
140
+ | - virology | 1|none | 0|acc |↑ |0.2771|± |0.0348|
141
+ | - social sciences | 2|none | |acc |↑ |0.2281|± |0.0076|
142
  | - econometrics | 1|none | 0|acc |↑ |0.2105|± |0.0384|
143
+ | - high_school_geography | 1|none | 0|acc |↑ |0.1818|± |0.0275|
144
  | - high_school_government_and_politics| 1|none | 0|acc |↑ |0.2280|± |0.0303|
145
+ | - high_school_macroeconomics | 1|none | 0|acc |↑ |0.2410|± |0.0217|
146
+ | - high_school_microeconomics | 1|none | 0|acc |↑ |0.2353|± |0.0276|
147
+ | - high_school_psychology | 1|none | 0|acc |↑ |0.2055|± |0.0173|
148
  | - human_sexuality | 1|none | 0|acc |↑ |0.2595|± |0.0384|
149
+ | - professional_psychology | 1|none | 0|acc |↑ |0.2418|± |0.0173|
150
  | - public_relations | 1|none | 0|acc |↑ |0.2091|± |0.0390|
151
+ | - security_studies | 1|none | 0|acc |↑ |0.2408|± |0.0274|
152
+ | - sociology | 1|none | 0|acc |↑ |0.2040|± |0.0285|
153
  | - us_foreign_policy | 1|none | 0|acc |↑ |0.3100|± |0.0465|
154
+ | - stem | 2|none | |acc |↑ |0.2195|± |0.0074|
155
+ | - abstract_algebra | 1|none | 0|acc |↑ |0.2700|± |0.0446|
156
  | - anatomy | 1|none | 0|acc |↑ |0.1630|± |0.0319|
157
+ | - astronomy | 1|none | 0|acc |↑ |0.2303|± |0.0343|
158
+ | - college_biology | 1|none | 0|acc |↑ |0.2569|± |0.0365|
159
+ | - college_chemistry | 1|none | 0|acc |↑ |0.2400|± |0.0429|
160
  | - college_computer_science | 1|none | 0|acc |↑ |0.2100|± |0.0409|
161
+ | - college_mathematics | 1|none | 0|acc |↑ |0.2100|± |0.0409|
162
+ | - college_physics | 1|none | 0|acc |↑ |0.2745|± |0.0444|
163
  | - computer_security | 1|none | 0|acc |↑ |0.3000|± |0.0461|
164
+ | - conceptual_physics | 1|none | 0|acc |↑ |0.1957|± |0.0259|
165
+ | - electrical_engineering | 1|none | 0|acc |↑ |0.2276|± |0.0349|
166
  | - elementary_mathematics | 1|none | 0|acc |↑ |0.2302|± |0.0217|
167
+ | - high_school_biology | 1|none | 0|acc |↑ |0.1968|± |0.0226|
168
  | - high_school_chemistry | 1|none | 0|acc |↑ |0.1527|± |0.0253|
169
+ | - high_school_computer_science | 1|none | 0|acc |↑ |0.2500|± |0.0435|
170
+ | - high_school_mathematics | 1|none | 0|acc |↑ |0.1963|± |0.0242|
171
  | - high_school_physics | 1|none | 0|acc |↑ |0.2053|± |0.0330|
172
+ | - high_school_statistics | 1|none | 0|acc |↑ |0.2269|± |0.0286|
173
  | - machine_learning | 1|none | 0|acc |↑ |0.2768|± |0.0425|
174
+ |truthfulqa_mc2 | 2|none | 0|acc |↑ |0.4681|± |0.0159|
175
+ |winogrande | 1|none | 0|acc |↑ |0.5146|± |0.0140|
176
 
177
  | Groups |Version|Filter|n-shot|Metric| |Value | |Stderr|
178
  |------------------|------:|------|------|------|---|-----:|---|-----:|
179
+ |mmlu | 2|none | |acc |↑ |0.2315|± |0.0036|
180
+ | - humanities | 2|none | |acc |↑ |0.2372|± |0.0062|
181
+ | - other | 2|none | |acc |↑ |0.2385|± |0.0076|
182
+ | - social sciences| 2|none | |acc |↑ |0.2281|± |0.0076|
183
+ | - stem | 2|none | |acc |↑ |0.2195|± |0.0074|
out/pretrain/final/evaluate/config.json DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:d68796194460465eef693c3a2a4dfc2df61655c8a7e8c9119ac22c07bd9e7f27
3
- size 547
out/pretrain/final/evaluate/model_config.yaml DELETED
@@ -1,33 +0,0 @@
1
- attention_logit_softcapping: null
2
- attention_scores_scalar: null
3
- bias: false
4
- block_size: 32768
5
- final_logit_softcapping: null
6
- gelu_approximate: none
7
- head_size: 16
8
- hf_config: {}
9
- intermediate_size: 2048
10
- lm_head_bias: false
11
- mlp_class_name: LLaMAMLP
12
- n_embd: 512
13
- n_expert: 0
14
- n_expert_per_token: 0
15
- n_head: 32
16
- n_layer: 20
17
- n_query_groups: 4
18
- name: ''
19
- norm_class_name: RMSNorm
20
- norm_eps: 1.0e-05
21
- padded_vocab_size: 32768
22
- padding_multiple: 512
23
- parallel_residual: false
24
- post_attention_norm: false
25
- post_mlp_norm: false
26
- rope_base: 500000
27
- rope_condense_ratio: 1
28
- rotary_percentage: 1.0
29
- scale_embeddings: false
30
- shared_attention_norm: false
31
- sliding_window_layer_placing: null
32
- sliding_window_size: null
33
- vocab_size: 32768
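For reference, the **108M** parameter figure quoted in the README is consistent with this (now removed) config. A back-of-the-envelope sketch, assuming litgpt's Llama-style layout (fused QKV projection with grouped-query attention, two RMSNorms per block, untied embedding and LM head):

```python
# Parameter count implied by model_config.yaml above (a sketch; assumes
# litgpt's Llama-style blocks and no embedding/LM-head weight tying).
n_embd, n_layer, n_head, n_query_groups, head_size = 512, 20, 32, 4, 16
intermediate_size, vocab = 2048, 32768

embedding = vocab * n_embd                                # token embeddings
qkv = n_embd * (n_head + 2 * n_query_groups) * head_size  # fused Q/K/V (GQA)
attn_proj = (n_head * head_size) * n_embd                 # attention output projection
mlp = 3 * n_embd * intermediate_size                      # LLaMAMLP: fc_1, fc_2, proj
norms = 2 * n_embd                                        # two RMSNorm weights per block
block = qkv + attn_proj + mlp + norms
lm_head = n_embd * vocab                                  # untied output head
total = embedding + n_layer * block + n_embd + lm_head    # + final RMSNorm
print(f"{total:,}")  # 108,286,464 ≈ 108M
```

At two bytes per bf16 weight this comes to about 216.6 MB, in line with the 216,632,102-byte `pytorch_model.bin` deleted below.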
out/pretrain/final/evaluate/pytorch_model.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:ab60761622286ea11d7ce9f33d6865aa01e9300c9cd59ebf201b20db15aee8ac
3
- size 216632102
out/pretrain/final/evaluate/results.json DELETED
File without changes
out/pretrain/final/evaluate/tokenizer.json DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:5b496a30dc268bcb8adfd551f693e68e9eadd06b81cab385c088a61e7663649c
3
- size 1368561
out/pretrain/final/evaluate/tokenizer_config.json DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:d6333d68c3280be6081b795cc160fd5872707562021f9889b2e2bd3ae508fa62
3
- size 23043
scripts/TRAIN.md CHANGED
@@ -57,7 +57,9 @@ model.save_pretrained('out/converted_model/')
57
  ## Evaluate
58
 
59
  ```bash
60
- # litgpt evaluate --tasks 'hellaswag,gsm8k,truthfulqa_mc2,mmlu,winogrande,arc_challenge' --batch_size 8 out/pretrain/final/
61
 
62
- litgpt evaluate --tasks 'hellaswag,gsm8k,truthfulqa_mc2,mmlu,mmlu_pro,winogrande,arc_challenge,leaderboard,ifeval,mgsm_direct,mathqa,gpqa' --batch_size 8 out/pretrain/final/
63
  ```
 
57
  ## Evaluate
58
 
59
  ```bash
60
+ litgpt evaluate --tasks 'leaderboard' --out_dir 'evaluate-0/' --batch_size 4 --dtype 'bfloat16' out/pretrain/final/
61
 
62
+ litgpt evaluate --tasks 'hellaswag,gsm8k,truthfulqa_mc2,mmlu,winogrande,arc_challenge' --out_dir 'evaluate-1/' --batch_size 4 --dtype 'bfloat16' out/pretrain/final/
63
+
64
+ litgpt evaluate --tasks 'mmlu_pro,ifeval,mgsm_direct,mathqa,gpqa' --out_dir 'evaluate-2/' --batch_size 4 --dtype 'bfloat16' out/pretrain/final/
65
  ```
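Each run writes its harness output under the given `--out_dir`, so the three invocations above keep their results separate. A small sketch for collecting the aggregate numbers afterwards, assuming lm-evaluation-harness's standard `results.json` layout in each directory:

```python
# Collect aggregate metrics from the three evaluation runs above; assumes
# each out_dir contains a standard lm-evaluation-harness results.json.
import json
from pathlib import Path

for out_dir in ("evaluate-0", "evaluate-1", "evaluate-2"):
    path = Path(out_dir) / "results.json"
    if not path.exists():  # skip runs that have not finished yet
        continue
    results = json.loads(path.read_text())
    for task, metrics in results["results"].items():
        accs = {k: v for k, v in metrics.items() if "acc" in k}
        print(f"{out_dir}/{task}: {accs}")
```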
scripts/requirements-lit.in CHANGED
@@ -6,6 +6,6 @@ transformers
6
  bitsandbytes
7
  wandb
8
  # litgpt[all]
9
- litgpt[all] @ git+https://github.com/mtasic85/litgpt.git
10
  litdata
11
  grokadamw
 
6
  bitsandbytes
7
  wandb
8
  # litgpt[all]
9
+ litgpt[all] @ git+https://github.com/Lightning-AI/litgpt.git
10
  litdata
11
  grokadamw