Files changed (1)
  1. README.md +117 -1
README.md CHANGED
@@ -1,5 +1,108 @@
1
  ---
2
  license: cc-by-nc-4.0
3
  ---
4
 
5
  new 9b-yi released in may
@@ -18,4 +121,17 @@ applied to llama3 8b instruct
18
  3. The Provider disclaims all liability for any damages or losses resulting from the use or misuse of the Model, including but not limited to any damages or losses arising from the use of the Model for purposes other than those intended by the Provider.
19
  4. The Provider does not endorse or condone the use of the Model for any purpose that violates applicable laws, regulations, or ethical standards.
20
  5. The Provider does not warrant that the Model will meet your specific requirements or that it will be error-free or that it will function without interruption.
21
- 6. You assume all risks associated with the use of the Model, including but not limited to any loss of data, loss of business, or damage to your reputation.
1
  ---
2
  license: cc-by-nc-4.0
3
+ model-index:
4
+ - name: yi-9b-may-ortho-baukit-30fail-3000total-bf16
5
+   results:
6
+   - task:
7
+       type: text-generation
8
+       name: Text Generation
9
+     dataset:
10
+       name: AI2 Reasoning Challenge (25-Shot)
11
+       type: ai2_arc
12
+       config: ARC-Challenge
13
+       split: test
14
+       args:
15
+         num_few_shot: 25
16
+     metrics:
17
+     - type: acc_norm
18
+       value: 60.58
19
+       name: normalized accuracy
20
+     source:
21
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Edgerunners/yi-9b-may-ortho-baukit-30fail-3000total-bf16
22
+       name: Open LLM Leaderboard
23
+   - task:
24
+       type: text-generation
25
+       name: Text Generation
26
+     dataset:
27
+       name: HellaSwag (10-Shot)
28
+       type: hellaswag
29
+       split: validation
30
+       args:
31
+         num_few_shot: 10
32
+     metrics:
33
+     - type: acc_norm
34
+       value: 76.91
35
+       name: normalized accuracy
36
+     source:
37
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Edgerunners/yi-9b-may-ortho-baukit-30fail-3000total-bf16
38
+       name: Open LLM Leaderboard
39
+   - task:
40
+       type: text-generation
41
+       name: Text Generation
42
+     dataset:
43
+       name: MMLU (5-Shot)
44
+       type: cais/mmlu
45
+       config: all
46
+       split: test
47
+       args:
48
+         num_few_shot: 5
49
+     metrics:
50
+     - type: acc
51
+       value: 66.74
52
+       name: accuracy
53
+     source:
54
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Edgerunners/yi-9b-may-ortho-baukit-30fail-3000total-bf16
55
+       name: Open LLM Leaderboard
56
+   - task:
57
+       type: text-generation
58
+       name: Text Generation
59
+     dataset:
60
+       name: TruthfulQA (0-shot)
61
+       type: truthful_qa
62
+       config: multiple_choice
63
+       split: validation
64
+       args:
65
+         num_few_shot: 0
66
+     metrics:
67
+     - type: mc2
68
+       value: 53.38
69
+     source:
70
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Edgerunners/yi-9b-may-ortho-baukit-30fail-3000total-bf16
71
+       name: Open LLM Leaderboard
72
+   - task:
73
+       type: text-generation
74
+       name: Text Generation
75
+     dataset:
76
+       name: Winogrande (5-shot)
77
+       type: winogrande
78
+       config: winogrande_xl
79
+       split: validation
80
+       args:
81
+         num_few_shot: 5
82
+     metrics:
83
+     - type: acc
84
+       value: 72.77
85
+       name: accuracy
86
+     source:
87
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Edgerunners/yi-9b-may-ortho-baukit-30fail-3000total-bf16
88
+       name: Open LLM Leaderboard
89
+   - task:
90
+       type: text-generation
91
+       name: Text Generation
92
+     dataset:
93
+       name: GSM8k (5-shot)
94
+       type: gsm8k
95
+       config: main
96
+       split: test
97
+       args:
98
+         num_few_shot: 5
99
+     metrics:
100
+     - type: acc
101
+       value: 68.01
102
+       name: accuracy
103
+     source:
104
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Edgerunners/yi-9b-may-ortho-baukit-30fail-3000total-bf16
105
+       name: Open LLM Leaderboard
106
  ---
107
 
108
  new 9b-yi released in may
 
121
  3. The Provider disclaims all liability for any damages or losses resulting from the use or misuse of the Model, including but not limited to any damages or losses arising from the use of the Model for purposes other than those intended by the Provider.
122
  4. The Provider does not endorse or condone the use of the Model for any purpose that violates applicable laws, regulations, or ethical standards.
123
  5. The Provider does not warrant that the Model will meet your specific requirements or that it will be error-free or that it will function without interruption.
124
+ 6. You assume all risks associated with the use of the Model, including but not limited to any loss of data, loss of business, or damage to your reputation.
125
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
126
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Edgerunners__yi-9b-may-ortho-baukit-30fail-3000total-bf16)
127
+
128
+ | Metric |Value|
129
+ |---------------------------------|----:|
130
+ |Avg. |66.40|
131
+ |AI2 Reasoning Challenge (25-Shot)|60.58|
132
+ |HellaSwag (10-Shot) |76.91|
133
+ |MMLU (5-Shot) |66.74|
134
+ |TruthfulQA (0-shot) |53.38|
135
+ |Winogrande (5-shot) |72.77|
136
+ |GSM8k (5-shot) |68.01|
137
+
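
Reviewer note: the `Avg.` row in the added table can be sanity-checked against the six benchmark rows. The sketch below assumes the leaderboard average is a plain unweighted mean of those six scores; the diff does not state this, and the helper name `leaderboard_average` is hypothetical. The scores are copied from the table in this diff.

```python
# Scores copied from the results table added in this diff.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 60.58,
    "HellaSwag (10-Shot)": 76.91,
    "MMLU (5-Shot)": 66.74,
    "TruthfulQA (0-shot)": 53.38,
    "Winogrande (5-shot)": 72.77,
    "GSM8k (5-shot)": 68.01,
}


def leaderboard_average(values):
    """Return the unweighted mean of the given scores (an assumed aggregation)."""
    values = list(values)
    return sum(values) / len(values)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")
```

Under that assumption the computed mean rounds to the 66.40 reported in the `Avg.` row, so the table is internally consistent.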