Adding Evaluation Results

#2
Files changed (1)
  1. README.md +25 -18
README.md CHANGED
@@ -1,5 +1,10 @@
  ---
+ language:
+ - en
+ license: apache-2.0
  library_name: transformers
+ base_model:
+ - meta-llama/Llama-3.1-8B-Instruct
  model-index:
  - name: ldm_soup_Llama-3.1-8B-Inst
  results:
@@ -16,8 +21,7 @@ model-index:
  value: 80.33
  name: strict accuracy
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -32,8 +36,7 @@ model-index:
  value: 31.1
  name: normalized accuracy
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -48,8 +51,7 @@ model-index:
  value: 11.56
  name: exact match
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -64,8 +66,7 @@ model-index:
  value: 5.26
  name: acc_norm
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -80,8 +81,7 @@ model-index:
  value: 11.52
  name: acc_norm
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -98,14 +98,8 @@ model-index:
  value: 32.07
  name: accuracy
  source:
- url: >-
- https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
  name: Open LLM Leaderboard
- license: apache-2.0
- language:
- - en
- base_model:
- - meta-llama/Llama-3.1-8B-Instruct
  ---

  # Model Card for DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
@@ -144,4 +138,17 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
  |MATH Lvl 5 (4-Shot)|11.56|
  |GPQA (0-shot) | 5.26|
  |MuSR (0-shot) |11.52|
- |MMLU-PRO (5-shot) |32.07|
+ |MMLU-PRO (5-shot) |32.07|
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_DeepAutoAI__ldm_soup_Llama-3.1-8B-Inst)
+
+ | Metric |Value|
+ |-------------------|----:|
+ |Avg. |28.64|
+ |IFEval (0-Shot) |80.33|
+ |BBH (3-Shot) |31.10|
+ |MATH Lvl 5 (4-Shot)|11.56|
+ |GPQA (0-shot) | 5.26|
+ |MuSR (0-shot) |11.52|
+ |MMLU-PRO (5-shot) |32.07|
+
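As a sanity check on the added table, the `Avg.` row appears to be the unweighted mean of the six benchmark scores. The sketch below assumes that is how the value is derived; the diff itself does not say so. The helper name `leaderboard_average` is hypothetical and not part of any leaderboard tooling. The scores are copied from the table rows.

```python
# Scores copied from the metric table added in this diff.
scores = {
    "IFEval (0-Shot)": 80.33,
    "BBH (3-Shot)": 31.10,
    "MATH Lvl 5 (4-Shot)": 11.56,
    "GPQA (0-shot)": 5.26,
    "MuSR (0-shot)": 11.52,
    "MMLU-PRO (5-shot)": 32.07,
}


def leaderboard_average(values):
    """Unweighted mean rounded to two decimals.

    Assumption: the Avg. row is a plain arithmetic mean of the six
    benchmark values; this helper is illustrative only.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(average)
```

Under that assumption, the computed mean matches the 28.64 reported in the `Avg.` row, so the table is internally consistent with the per-task `value` fields in the `model-index` metadata.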