Update README.md
# cybertron-v4-qw7B-UNAMGS

**UNA IS BACK** Cybertron v4 UNA-MGS, based on the amazing Qwen2.5 7B.

**SCORING #1 7-8B LLM WITH NO CONTAMINATION (21.11.2024, avg. 31.82)**

![cybertron-v4-MGS](https://huggingface.co/fblgit/cybertron-v4-qw7B-MGS/resolve/main/cybertron_v4MGS.png)

Here we use our novel approach called `MGS`. It's up to you to figure out what it means.

Cybertron V4 went through SFT with `MGS & UNA` over the `Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1` dataset.
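
MGS and UNA themselves are not published, so the exact recipe can't be reproduced from this card alone. As a rough point of reference, a plain TRL SFT pass over the same dataset might look like the sketch below; the `output_dir`, any dataset reformatting, and the training arguments are assumptions, not the author's actual setup:

```python
# Rough SFT baseline sketch -- MGS & UNA are the author's own (unpublished)
# techniques, so only the plain supervised fine-tuning part is shown.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Depending on the TRL version, the chat column may need remapping to the
# "messages" format first; check the dataset card for its exact schema.
dataset = load_dataset("Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B",                           # base model named in this card
    train_dataset=dataset,
    args=SFTConfig(output_dir="unamgs-sft-baseline"),  # assumed output path
)
trainer.train()
```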

## Contamination Benchmark

- MATH (5-gram overlap with the MATH train/test splits):

```
5gram-Qwen2.5-7B-Instruct-orgn-MATH-test.jsonl: 37.52666666666667
5gram-Qwen2.5-7B-Instruct-orgn-MATH-train.jsonl: 46.36666666666667
```

vs

```
5gram-UNA-cybertron-v4-qw7B-MGS-orgn-MATH-test.jsonl: 37.42666666666667
5gram-UNA-cybertron-v4-qw7B-MGS-orgn-MATH-train.jsonl: 46.053333333333335
```
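
The card doesn't include the script behind these numbers, but the filenames point to a 5-gram overlap check between each model's data and the MATH splits. A minimal sketch of that kind of measurement; the helper names and whitespace tokenization are assumptions, not the author's actual tooling:

```python
# Sketch of a 5-gram overlap measurement: for each candidate sample, the
# percentage of its 5-grams that also occur anywhere in the reference split,
# averaged over all samples.
def ngrams(tokens, n=5):
    """Set of all contiguous n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_percent(candidates, reference_texts, n=5):
    reference = set()
    for text in reference_texts:
        reference |= ngrams(text.split(), n)
    scores = [
        100 * len(grams & reference) / len(grams)
        for grams in (ngrams(text.split(), n) for text in candidates)
        if grams
    ]
    return sum(scores) / len(scores)
```

Comparable scores for the fine-tune and its base model (as shown above) indicate the fine-tuning data added no measurable contamination.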
## Quantz

Soon...
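
Until official quants are posted, one way to run the model in reduced precision is an on-the-fly 4-bit load with bitsandbytes; a sketch, assuming a CUDA GPU with enough memory for the 7B weights in 4-bit:

```python
# On-the-fly 4-bit load via bitsandbytes; not an official quant release.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "fblgit/cybertron-v4-qw7B-UNAMGS"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```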
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__cybertron-v4-qw7B-UNAMGS).

| Metric             |Value|
|--------------------|----:|
|Avg.                |31.82|
|IFEval (0-Shot)     |60.84|
|BBH (3-Shot)        |37.71|
|MATH Lvl 5 (4-Shot) |29.91|
|GPQA (0-shot)       |10.85|
|MuSR (0-shot)       |12.69|
|MMLU-PRO (5-shot)   |38.89|

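For programmatic access, the per-task details can be pulled from the linked dataset. The config name and split below are illustrative placeholders, not confirmed by this card; check the dataset card for the exact names available:

```python
# Pull per-task leaderboard details; config/split names vary by repo, so
# treat the config and "latest" below as placeholders to verify.
from datasets import load_dataset

details = load_dataset(
    "open-llm-leaderboard/details_fblgit__cybertron-v4-qw7B-UNAMGS",
    "fblgit__cybertron-v4-qw7B-UNAMGS__leaderboard_ifeval",  # illustrative config
    split="latest",
)
```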
## MGS & UNA & Details

* MGS, `1+1 = 2 and not 3`

journal={arXiv preprint arXiv:2407.10671},
year={2024}
}
```