fblgit committed
Commit 2ff0836
1 Parent(s): e1cdc5b

Update README.md

Files changed (1): README.md (+22 −7)
README.md CHANGED
@@ -2,6 +2,7 @@
 license: apache-2.0
 datasets:
 - fblgit/simple-math
 base_model: abacusai/Smaug-34B-v0.1
 tags:
 - UNA
@@ -11,24 +12,38 @@ tags:

 # UNA-SimpleSmaug-34b-v1beta

- So far an experiment, not sure how it went. Applied UNA only on the Attention, not on the MLP's
 * Is based on Smaug
 * SimpleMath dataset
 * It was trained on Axolotl

 ## Experiment
 The thing here is to understand what's the impact of SimpleMath applied at the attention layer during an SFT session, and how it impacts the neural network overall.

 ## Evals

 Pending, but so far this one:
 ```
- | Task |Version| Metric |Value | |Stderr|
- |-------------|------:|--------|-----:|---|-----:|
- |arc_challenge| 0|acc |0.7201|± |0.0131|
- | | |acc_norm|0.7457|± |0.0127|
 ```

- Seems to increase GSM and ARC

 ## Citations
- To abacusai for making Smaug-34B, the Bagel, and all the magic behind the base model.
 license: apache-2.0
 datasets:
 - fblgit/simple-math
+ - jondurbin/bagel-v0.3
 base_model: abacusai/Smaug-34B-v0.1
 tags:
 - UNA

 # UNA-SimpleSmaug-34b-v1beta

+ As of 04-February-2024, the #1 scoring 34B model, outperforming its original base model Smaug-34B-v0.1 with `77.41` 😎
+
+ Applied UNA only on the Attention layers, not on the MLPs.
 * Is based on Smaug
 * SimpleMath dataset
 * It was trained on Axolotl

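Since the card names only Axolotl, the base model, and the dataset, a reproduction setup can only be sketched. The fragment below is illustrative, not the author's actual training file; everything except `base_model` and the dataset path is an assumed placeholder:

```yaml
# Illustrative Axolotl SFT config sketch -- NOT the author's actual config.
# base_model and the dataset path come from this card; the prompt type,
# sequence length, and all hyperparameters are assumed placeholders.
base_model: abacusai/Smaug-34B-v0.1
datasets:
  - path: fblgit/simple-math
    type: alpaca          # assumed prompt format
sequence_len: 4096        # assumed
micro_batch_size: 1       # assumed
num_epochs: 1             # assumed
output_dir: ./UNA-SimpleSmaug-34b-v1beta
```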
 ## Experiment
 The thing here is to understand what's the impact of SimpleMath applied at the attention layer during an SFT session, and how it impacts the neural network overall.
+
+ Results: improved mathematical and reasoning capabilities without degradation, preserving the previous training sessions.
+
 ## Evals

 Pending, but so far this one:
 ```
+ | Task          | Version | Metric   | Value           |
+ |---------------|--------:|----------|----------------:|
+ | arc_challenge |      HF | acc_norm | 0.7457337883959 |
+ | gsm8k         |      HF | acc      | 0.7247915087187 |
+ | mmlu          |      HF | acc      | 0.7649553475572 |
+ | mmlu          |      HF | acc_norm | 0.7681713551647 |
+ | hellaswag     |      HF | acc_norm | 0.8673571001792 |
+ | truthfulqa    |      HF | mc2      | 0.7016557407771 |
+ | winogrande    |      HF | acc      | 0.8382004735595 |
 ```

+ Increases the GSM8K, MMLU, ARC, and Winogrande scores.

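The headline `77.41` is consistent with these numbers: averaging the six Open LLM Leaderboard metrics from the table above (ARC acc_norm, HellaSwag acc_norm, MMLU acc, TruthfulQA mc2, Winogrande acc, GSM8K acc) lands very close to it. A quick sanity check, assuming the standard leaderboard average (the published figure may come from slightly different runs):

```python
# Sanity-check: average the six Open LLM Leaderboard metrics reported above.
# Assumes the standard leaderboard average; small deviations from the
# published 77.41 are expected.
scores = {
    "arc_challenge/acc_norm": 0.7457337883959,
    "hellaswag/acc_norm":     0.8673571001792,
    "mmlu/acc":               0.7649553475572,
    "truthfulqa/mc2":         0.7016557407771,
    "winogrande/acc":         0.8382004735595,
    "gsm8k/acc":              0.7247915087187,
}
average = 100 * sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # close to the reported 77.41
```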
 ## Citations
+ Thanks to abacusai for making Smaug-34B, the Bagel, and all the magic behind the base model.
+
+ If you use this model, please provide a citation, even for merges or derivatives.
+ And enjoy our ModelSimilarities detector tool: https://github.com/fblgit/model-similarity