Update README.md
README.md
CHANGED
@@ -42,7 +42,7 @@ model-index:
         num_few_shot: 3
     metrics:
     - type: acc_norm
-      value:
       name: normalized accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
@@ -57,7 +57,7 @@ model-index:
         num_few_shot: 4
     metrics:
     - type: exact_match
-      value:
       name: exact match
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
@@ -72,7 +72,7 @@ model-index:
         num_few_shot: 0
     metrics:
     - type: acc_norm
-      value:
       name: acc_norm
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
@@ -87,7 +87,7 @@ model-index:
         num_few_shot: 0
     metrics:
     - type: acc_norm
-      value:
       name: acc_norm
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
@@ -104,7 +104,7 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value:
       name: accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
@@ -131,13 +131,13 @@ This combination optimizes Anvita for superior reasoning, dynamic conversations,
 ## Evaluation Results
 | **Metric** | **Value** |
 |-------------------------|--------------:|
-| **Avg.** | **
-| **IFEval (0-Shot)** | 64.
-| **BBH (3-Shot)** |
-| **MATH Level 5 (4-Shot)**|
-| **GPQA (0-Shot)** |
-| **MuSR (0-Shot)** |
-| **MMLU-PRO (5-Shot)** |
 
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/sethuiyer__Qwen2.5-7B-Anvita-details).
 Personal Benchmarks - check [PERSONAL_BENCHMARK.md](./PERSONAL_BENCHMARK.md)

         num_few_shot: 3
     metrics:
     - type: acc_norm
+      value: 35.48
       name: normalized accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita

         num_few_shot: 4
     metrics:
     - type: exact_match
+      value: 15.86
       name: exact match
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita

         num_few_shot: 0
     metrics:
     - type: acc_norm
+      value: 10.29
       name: acc_norm
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita

         num_few_shot: 0
     metrics:
     - type: acc_norm
+      value: 13.47
       name: acc_norm
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita

         num_few_shot: 5
     metrics:
     - type: acc
+      value: 35.17
       name: accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita

 ## Evaluation Results
 | **Metric** | **Value** |
 |-------------------------|--------------:|
+| **Avg.** | **29.18** |
+| **IFEval (0-Shot)** | 64.8 |
+| **BBH (3-Shot)** | 35.48 |
+| **MATH Level 5 (4-Shot)**| 15.86 |
+| **GPQA (0-Shot)** | 10.29 |
+| **MuSR (0-Shot)** | 13.47 |
+| **MMLU-PRO (5-Shot)** | 35.17 |
 
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/sethuiyer__Qwen2.5-7B-Anvita-details).
 Personal Benchmarks - check [PERSONAL_BENCHMARK.md](./PERSONAL_BENCHMARK.md)
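
As a quick sanity check on the added table, the `Avg.` row can be recomputed from the six benchmark scores. This is a minimal sketch, assuming `Avg.` is the unweighted mean of the six rows rounded to two decimals; the variable names are illustrative and not part of the README.

```python
# Scores copied from the added rows of the Evaluation Results table.
scores = {
    "IFEval (0-Shot)": 64.8,
    "BBH (3-Shot)": 35.48,
    "MATH Level 5 (4-Shot)": 15.86,
    "GPQA (0-Shot)": 10.29,
    "MuSR (0-Shot)": 13.47,
    "MMLU-PRO (5-Shot)": 35.17,
}

# Assumption: "Avg." is the plain (unweighted) mean of the six benchmarks.
average = sum(scores.values()) / len(scores)

print(f"Recomputed average: {average:.2f}")
```

Under that assumption the recomputed value agrees with the **29.18** shown in the table.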