leaderboard-pr-bot committed on
Commit
fb131b2
1 Parent(s): ee91097

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +117 -0
README.md CHANGED
@@ -30,6 +30,109 @@ prompt_template: '[INST] <<SYS>>
 
  '
  quantized_by: TheBloke
+ model-index:
+ - name: Llama-2-7b-Chat-AWQ
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 27.22
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Llama-2-7b-Chat-AWQ
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 25.48
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Llama-2-7b-Chat-AWQ
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 24.67
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Llama-2-7b-Chat-AWQ
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 49.95
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Llama-2-7b-Chat-AWQ
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 47.51
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Llama-2-7b-Chat-AWQ
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 0.0
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Llama-2-7b-Chat-AWQ
+       name: Open LLM Leaderboard
  ---
 
  <!-- header start -->
@@ -376,3 +479,17 @@ Please report any software “bug,” or other problems with the models through
  |7B| [Link](https://huggingface.co/llamaste/Llama-2-7b) | [Link](https://huggingface.co/llamaste/Llama-2-7b-hf) | [Link](https://huggingface.co/llamaste/Llama-2-7b-chat) | [Link](https://huggingface.co/llamaste/Llama-2-7b-chat-hf)|
  |13B| [Link](https://huggingface.co/llamaste/Llama-2-13b) | [Link](https://huggingface.co/llamaste/Llama-2-13b-hf) | [Link](https://huggingface.co/llamaste/Llama-2-13b-chat) | [Link](https://huggingface.co/llamaste/Llama-2-13b-hf)|
  |70B| [Link](https://huggingface.co/llamaste/Llama-2-70b) | [Link](https://huggingface.co/llamaste/Llama-2-70b-hf) | [Link](https://huggingface.co/llamaste/Llama-2-70b-chat) | [Link](https://huggingface.co/llamaste/Llama-2-70b-hf)|
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_TheBloke__Llama-2-7b-Chat-AWQ)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |29.14|
+ |AI2 Reasoning Challenge (25-Shot)|27.22|
+ |HellaSwag (10-Shot)              |25.48|
+ |MMLU (5-Shot)                    |24.67|
+ |TruthfulQA (0-shot)              |49.95|
+ |Winogrande (5-shot)              |47.51|
+ |GSM8k (5-shot)                   | 0.00|
+
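The `Avg.` row in the table is consistent with an unweighted mean of the six benchmark scores, rounded to two decimals. The PR does not state how the average is computed, so the sketch below is a sanity check under that assumption, not the leaderboard's own code. The names `scores` and `leaderboard_average` are illustrative only.

```python
# Benchmark scores copied from the table added by this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 27.22,
    "HellaSwag (10-Shot)": 25.48,
    "MMLU (5-Shot)": 24.67,
    "TruthfulQA (0-shot)": 49.95,
    "Winogrande (5-shot)": 47.51,
    "GSM8k (5-shot)": 0.00,
}


def leaderboard_average(scores: dict[str, float]) -> float:
    """Unweighted mean of all scores, rounded to two decimals (assumed formula)."""
    return round(sum(scores.values()) / len(scores), 2)


average = leaderboard_average(scores)
print(f"Avg. {average:.2f}")  # prints "Avg. 29.14"
```

Under this assumption the 0.00 on GSM8k counts like any other score, so it pulls the average well below the TruthfulQA and Winogrande values.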