thomas-yanxin/XinYuan-Qwen2-7B


thomas-yanxin committed on Aug 23

Commit c62d83e • 1 Parent(s): 7fccd26

Update README.md

Files changed (1)

README.md +18 -1


@@ -10,4 +10,21 @@ datasets:
 The main purpose of this model is to validate the usability of [thomas-yanxin/MT-SFT-ShareGPT](https://huggingface.co/datasets/thomas-yanxin/MT-SFT-ShareGPT), i.e., the quality of the data is all you need. We found that when we meticulously extract the data through a better data governance approach, the corresponding model results can be vastly improved, even if only through SFT.
-Here are the results from our OpenCompass evaluation:

+Here are the results from our OpenCompass evaluation:
+| Classification | Benchmarks | Models |
+| :------------: | :--------: | :--------: |
+| | Name | XinYuan-Qwen2-7B |
+| English | MMLU | 68.71 |
+| | MMLU-Pro | 30.56 |
+| | TheoremQA | 25.3 |
+| | GPQA | 29.2 |
+| | BBH | 60.3 |
+| | IFEval (Prompt Strict-Acc.) | 39.2 |
+| | ARC-C | 87.5 |
+| Math | GSM8K | 75.4 |
+| | MATH | 34.76 |
+| Chinese | C-EVAL | 82.0 |
+| | CMMLU | 77.9 |
+| Code | MBPP | 50.6 |
+| | HumanEval | 70.1 |