facat committed on
Commit
e903c71
1 Parent(s): bdcb689

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +20 -25
README.md CHANGED
@@ -7,10 +7,8 @@ widget:
 text: hi
 output:
 text: ' Hello! How can I assist you today?'
-
 pipeline_tag: text-generation
 ---
-
 # 🐷SUS-Chat: Instruction tuning done right
 
 <p align="left">
@@ -187,8 +185,7 @@ data-layout-align="center">
 </tr>
 <tr class="even">
 <td style="text-align: right;">SUS-Chat-34B</td>
-<td style="text-align: center;"><span
-class="math inline">$\underline{74.35}$</span></td>
+<td style="text-align: center;"><u>74.35</u></td>
 </tr>
 <tr class="odd">
 <td style="text-align: right;">Qwen-72b-Chat</td>
@@ -240,10 +237,8 @@ role="doc-noteref"><sup>1</sup></a></th>
 </tr>
 <tr class="odd">
 <td style="text-align: right;">Qwen-72b-Chat</td>
-<td style="text-align: center;"><span
-class="math inline">$\underline{77.02}$</span></td>
-<td style="text-align: center;"><span
-class="math inline">$\underline{77.22}$</span></td>
+<td style="text-align: center;"><u>77.02</u></td>
+<td style="text-align: center;"><u>77.22</u></td>
 </tr>
 <tr class="even">
 <td style="text-align: right;">Deepseek-68b-Chat</td>
@@ -280,25 +275,25 @@ role="doc-backlink">↩︎</a></p></li>
 
 ## Math & Reasoning
 
-| Model | gsm8k (0-shot) | MATH (0-shot) | BBH (0-shot) |
-|----------------------:|:-------------------:|:-------------------:|:-------------------:|
-| GPT-4 | 91.4 | 45.8 | 86.7 |
-| SUS-Chat-34B | **80.06** | 28.7 | 67.62 |
-| Qwen-72b-Chat | $\underline{76.57}$ | **35.9** | **72.63** |
-| Deepseek-68b-Chat | 74.45 | $\underline{29.56}$ | $\underline{69.73}$ |
-| OrionStar-Yi-34B-Chat | 54.36 | 12.8 | 62.88 |
-| Yi-34B-Chat | 63.76 | 10.02 | 61.54 |
+| Model | gsm8k (0-shot) | MATH (0-shot) | BBH (0-shot) |
+|----------------------:|:--------------:|:-------------:|:------------:|
+| GPT-4 | 91.4 | 45.8 | 86.7 |
+| SUS-Chat-34B | **80.06** | 28.7 | 67.62 |
+| Qwen-72b-Chat | <u>76.57</u> | **35.9** | **72.63** |
+| Deepseek-68b-Chat | 74.45 | <u>29.56</u> | <u>69.73</u> |
+| OrionStar-Yi-34B-Chat | 54.36 | 12.8 | 62.88 |
+| Yi-34B-Chat | 63.76 | 10.02 | 61.54 |
 
 ## More Tasks
 
-| Model | winogrande (5-shot) | arc (25-shot) | hellaswag (10-shot) | TruthfulQA mc1 (0-shot) | TruthfulQA mc2 (0-shot) |
-|----------------------:|:-------------------:|:-------------------:|:-------------------:|:-----------------------:|:-----------------------:|
-| GPT-4 | — | 94.5 | 91.4 | 59.00 | — |
-| SUS-Chat-34B | **81.22** | $\underline{81.54}$ | 83.79 | **40.64** | **57.47** |
-| Qwen-72b-Chat | 76.09 | **82.10** | $\underline{86.06}$ | 39.17 | $\underline{56.37}$ |
-| Deepseek-68b-Chat | $\underline{80.58}$ | 81.29 | **87.02** | $\underline{40.02}$ | 50.64 |
-| OrionStar-Yi-34B-Chat | 77.27 | 80.19 | 84.54 | 36.47 | 53.24 |
-| Yi-34B-Chat | 76.64 | 70.66 | 82.29 | 38.19 | 54.57 |
+| Model | winogrande (5-shot) | arc (25-shot) | hellaswag (10-shot) | TruthfulQA mc1 (0-shot) | TruthfulQA mc2 (0-shot) |
+|----------------------:|:-------------------:|:-------------:|:-------------------:|:-----------------------:|:-----------------------:|
+| GPT-4 | — | 94.5 | 91.4 | 59.00 | — |
+| SUS-Chat-34B | **81.22** | <u>81.54</u> | 83.79 | **40.64** | **57.47** |
+| Qwen-72b-Chat | 76.09 | **82.10** | <u>86.06</u> | 39.17 | <u>56.37</u> |
+| Deepseek-68b-Chat | <u>80.58</u> | 81.29 | **87.02** | <u>40.02</u> | 50.64 |
+| OrionStar-Yi-34B-Chat | 77.27 | 80.19 | 84.54 | 36.47 | 53.24 |
+| Yi-34B-Chat | 76.64 | 70.66 | 82.29 | 38.19 | 54.57 |
 
 ## Overall
 
@@ -400,4 +395,4 @@ model.
 This model is developed entirely for academic research and free
 commercial use, but it must adhere to the
 [license](https://github.com/01-ai/Yi/blob/main/MODEL_LICENSE_AGREEMENT.txt)
-from [01-ai](https://huggingface.co/01-ai).
+from [01-ai](https://huggingface.co/01-ai).
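
The substantive change in this diff is the replacement of LaTeX underlines such as `$\underline{76.57}$` (some wrapped in a `<span class="math inline">` element) with plain `<u>76.57</u>` markup. A minimal sketch of how such a substitution could be scripted follows. The function name `underline_math_to_html` and both regular expressions are illustrative assumptions, not the author's tooling; the commit message only says the README was uploaded with `huggingface_hub`, so the file may well have been regenerated some other way.

```python
import re

# Hypothetical patterns for illustration; not taken from the commit itself.
# Form 1: the underline wrapped in a "math inline" span, possibly with a
# line break between "<span" and "class", as in the removed HTML rows.
WRAPPED = re.compile(
    r'<span\s+class="math inline">\s*\$\\underline\{([^{}]*)\}\$\s*</span>'
)
# Form 2: the bare inline-math underline used in the Markdown tables.
BARE = re.compile(r'\$\\underline\{([^{}]*)\}\$')


def underline_math_to_html(text: str) -> str:
    """Rewrite $\\underline{x}$ (with or without the span wrapper) as <u>x</u>."""
    text = WRAPPED.sub(r"<u>\1</u>", text)  # handle the wrapped form first
    return BARE.sub(r"<u>\1</u>", text)     # then any remaining bare form


if __name__ == "__main__":
    row = r"| Qwen-72b-Chat | $\underline{76.57}$ | **35.9** | **72.63** |"
    print(underline_math_to_html(row))
```

This sketch only rewrites the underline markup. It does not re-pad the table separator rows, which the diff also narrows, and it leaves bold (`**...**`) cells untouched.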