fblgit committed
Commit: 4b9fbda
1 Parent(s): d8c42bf

Update README.md

Files changed (1):
  1. README.md +27 -15
README.md CHANGED
@@ -111,6 +111,7 @@ model-index:
 # cybertron-v4-qw7B-UNAMGS
 
 **UNA IS BACK** Cybertron v4 UNA-MGS, based on the amazing Qwen2.5 7B
+
 **SCORING #1 7-8B LLM WITH NO CONTAMINATION 21.11.2024 with avg. 31.82**
 
 ![cybertron-v4-MGS](https://huggingface.co/fblgit/cybertron-v4-qw7B-MGS/resolve/main/cybertron_v4MGS.png)
@@ -121,9 +122,34 @@ Here we use our novel approach called `MGS`. Its up to you to figure out what it
 
 Cybertron V4 went through SFT with `MGS & UNA` over the `Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1` dataset.
 
+## Contamination Benchmark
+- MATH:
+```
+5gram-Qwen2.5-7B-Instruct-orgn-MATH-test.jsonl: 37.52666666666667
+5gram-Qwen2.5-7B-Instruct-orgn-MATH-train.jsonl: 46.36666666666667
+```
+vs
+```
+5gram-UNA-cybertron-v4-qw7B-MGS-orgn-MATH-test.jsonl: 37.42666666666667
+5gram-UNA-cybertron-v4-qw7B-MGS-orgn-MATH-train.jsonl: 46.053333333333335
+```
+
 ## Quantz
 Soon..
 
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__cybertron-v4-qw7B-UNAMGS)
+
+| Metric             |Value|
+|--------------------|----:|
+|Avg.                |31.82|
+|IFEval (0-Shot)     |60.84|
+|BBH (3-Shot)        |37.71|
+|MATH Lvl 5 (4-Shot) |29.91|
+|GPQA (0-shot)       |10.85|
+|MuSR (0-shot)       |12.69|
+|MMLU-PRO (5-shot)   |38.89|
+
 ## MGS & UNA & Details
 
 * MGS, `1+1 = 2 and not 3`
@@ -219,18 +245,4 @@ The following hyperparameters were used during training:
   journal={arXiv preprint arXiv:2407.10671},
   year={2024}
 }
-```
-
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__cybertron-v4-qw7B-UNAMGS)
-
-| Metric             |Value|
-|--------------------|----:|
-|Avg.                |31.82|
-|IFEval (0-Shot)     |60.84|
-|BBH (3-Shot)        |37.71|
-|MATH Lvl 5 (4-Shot) |29.91|
-|GPQA (0-shot)       |10.85|
-|MuSR (0-shot)       |12.69|
-|MMLU-PRO (5-shot)   |38.89|
-
+```
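For the SFT corpus named in the diff above, here is a minimal sketch of peeking at it with the `datasets` library; the `train` split name and the streaming setup are assumptions, not details from the card:

```python
from datasets import load_dataset

# Stream the SFT corpus named in the card so the ~1M rows are not
# downloaded up front; the "train" split name is an assumption.
ds = load_dataset(
    "Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1",
    split="train",
    streaming=True,
)

# Inspect one record's schema before wiring up an SFT pipeline.
first = next(iter(ds))
print(sorted(first.keys()))
```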
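The `Contamination Benchmark` block added above reports 5-gram match rates against the MATH train/test splits that sit right on top of the Qwen2.5-7B-Instruct baseline. The card does not name the tool that produced those numbers, so what follows is only a minimal sketch of the general technique, assuming whitespace tokenization, set-based matching, and toy stand-in data:

```python
def five_grams(text):
    """All 5-grams of a naively whitespace-tokenized, lowercased string."""
    toks = text.lower().split()
    return {tuple(toks[i:i + 5]) for i in range(len(toks) - 4)}

def contamination_pct(sample, corpus_grams):
    """Percent of the sample's 5-grams that also occur in the corpus."""
    grams = five_grams(sample)
    if not grams:
        return 0.0
    return 100.0 * sum(g in corpus_grams for g in grams) / len(grams)

# Toy stand-ins: in practice `corpus` would be the SFT data and
# `benchmark` the MATH split being checked (both hypothetical here).
corpus = ["solve for x if 2 x + 3 = 7 then report x",
          "compute the area of a circle of radius r"]
benchmark = ["solve for x if 2 x + 3 = 7 and justify each step"]

# Build the corpus 5-gram set once, then score each benchmark sample.
corpus_grams = set().union(*(five_grams(d) for d in corpus))
scores = [contamination_pct(s, corpus_grams) for s in benchmark]
print(f"avg 5-gram overlap: {sum(scores) / len(scores):.2f}%")
```

The comparison in the card is relative: an SFT model whose overlap tracks its base model's has not picked up new benchmark text during fine-tuning, which is the basis for the "NO CONTAMINATION" claim.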
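The "Detailed results" link in the leaderboard section points at a per-task results dataset. A minimal sketch for discovering what it exposes, assuming only that it is a standard multi-config Hub dataset:

```python
from datasets import get_dataset_config_names

# Details repo linked in the card; exact config names are an assumption.
repo = "open-llm-leaderboard/details_fblgit__cybertron-v4-qw7B-UNAMGS"
print(get_dataset_config_names(repo))
```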
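With quants still pending ("Soon.."), here is a minimal sketch of running the full-precision checkpoint with `transformers`, assuming the repo id `fblgit/cybertron-v4-qw7B-UNAMGS` and that the tokenizer ships a Qwen2.5-style chat template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fblgit/cybertron-v4-qw7B-UNAMGS"  # inferred from the card title
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Format a chat turn with the tokenizer's own template.
messages = [{"role": "user",
             "content": "What does a 5-gram contamination check measure?"}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and decode only the newly produced tokens.
out = model.generate(inputs, max_new_tokens=256)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```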