Update README.md
README.md
CHANGED
---
license: apache-2.0
datasets:
- fblgit/simple-math
- jondurbin/bagel-v0.3
base_model: abacusai/Smaug-34B-v0.1
tags:
- UNA
---

# UNA-SimpleSmaug-34b-v1beta

As of 04-February-2024, the #1 34B model, outperforming its original base model Smaug-34B-v0.1 with an average score of `77.41` 😎

Applied UNA only on the attention layers, not on the MLPs.

* It is based on Smaug
* Trained on the SimpleMath dataset
* It was trained with Axolotl

## Experiment

The goal here is to understand the impact of SimpleMath applied at the attention layers during an SFT session, and how it affects the neural network overall.

Results: improved mathematical and reasoning capabilities while preserving previous training sessions.
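To make "UNA only on the attention" concrete, here is a minimal sketch of selecting trainable parameters by name, assuming a typical Llama-style naming scheme (`self_attn.q_proj`, `mlp.gate_proj`, …). The parameter names and selection rule are assumptions for illustration, not the published UNA recipe:

```python
# Sketch: pick only attention-projection parameters for the SFT session,
# leaving the MLP blocks frozen. Names are hypothetical, Llama-style;
# the actual UNA procedure is not described by this model card.
ATTN_KEYS = ("q_proj", "k_proj", "v_proj", "o_proj")

def attention_only(param_names):
    """Return the subset of parameters that belong to attention projections."""
    return [name for name in param_names if any(key in name for key in ATTN_KEYS)]

names = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.self_attn.o_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",  # stays frozen
    "model.layers.0.mlp.down_proj.weight",  # stays frozen
]
print(attention_only(names))
# ['model.layers.0.self_attn.q_proj.weight', 'model.layers.0.self_attn.o_proj.weight']
```

In a real fine-tuning loop, the same filter would decide which parameters get `requires_grad = True` before the optimizer is built.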

## Evals

Pending, but so far this one:

```
| Task         | Version | Metric   |           Value |
|--------------|--------:|----------|----------------:|
| arc_challenge|      HF | acc_norm | 0.7457337883959 |
| gsm8k        |      HF | acc      | 0.7247915087187 |
| mmlu         |      HF | acc      | 0.7649553475572 |
| mmlu         |      HF | acc_norm | 0.7681713551647 |
| hellaswag    |      HF | acc_norm | 0.8673571001792 |
| truthfulqa   |      HF | mc2      | 0.7016557407771 |
| winogrande   |      HF | acc      | 0.8382004735595 |
```

Improved GSM8K, MMLU, ARC, and Winogrande scores.
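For reference, the Open LLM Leaderboard headline number is just the mean of the six main tasks. Recomputing it from the table above (taking the `acc` MMLU row) lands close to, though not exactly at, the published `77.41`, since the leaderboard's own harness runs differ slightly:

```python
# Mean of the six Open-LLM-Leaderboard tasks from the eval table above
# (using the mmlu `acc` row; the published 77.41 comes from the
# leaderboard's own runs, so this local average is slightly lower).
scores = {
    "arc_challenge": 0.7457337883959,
    "hellaswag":     0.8673571001792,
    "mmlu":          0.7649553475572,
    "truthfulqa":    0.7016557407771,
    "winogrande":    0.8382004735595,
    "gsm8k":         0.7247915087187,
}
average = 100 * sum(scores.values()) / len(scores)
print(round(average, 2))  # 77.38
```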

## Citations

Thanks to abacusai for making Smaug-34B, the Bagel, and all the magic behind the base model.

If you use the model, please provide a citation, even for merges or derivatives.

And enjoy our ModelSimilarities detector tool: https://github.com/fblgit/model-similarity