---
license: other
language:
- zh
- en
datasets:
- thomas-yanxin/MT-SFT-ShareGPT
---
The main purpose of this model is to validate the usability of [thomas-yanxin/MT-SFT-ShareGPT](https://huggingface.co/datasets/thomas-yanxin/MT-SFT-ShareGPT), i.e., to test the claim that data quality is all you need. We found that when the data is meticulously curated through a better data governance approach, the resulting model improves substantially, even with SFT alone.
Here are the results from our OpenCompass evaluation:
| Classification | Benchmark | XinYuan-Qwen2-7B |
| :------------: | :--------: | :--------: |
| English | MMLU | 68.71 |
| | MMLU-Pro | 30.56 |
| | Theorem QA | 25.3 |
| | GPQA | 29.2 |
| | BBH | 60.3 |
| | IFEval (Prompt Strict-Acc.) | 39.2 |
| | ARC-C | 87.5 |
| Math | GSM8K | 75.4 |
| | MATH | 34.76 |
| Chinese | C-EVAL | 82.0 |
| | CMMLU | 77.9 |
| Code | MBPP | 50.6 |
| | HumanEval | 70.1 | |