---
license: other
language:
- zh
- en
datasets:
- thomas-yanxin/MT-SFT-ShareGPT
---

The main purpose of this model is to validate the usability of [thomas-yanxin/MT-SFT-ShareGPT](https://huggingface.co/datasets/thomas-yanxin/MT-SFT-ShareGPT), i.e., to test the idea that data quality is all you need. We found that when the data is carefully selected through a better data governance approach, model results can improve substantially, even when the only training step is SFT.

Here are the results from our OpenCompass evaluation:

| Classification | Benchmarks | Models |
| :------------: | :--------: | :--------: |
| | Name | XinYuan-Qwen2-7B |
| English | MMLU | 68.71 |
| | MMLU-Pro | 30.56 |
| | Theorem QA | 25.3 |
| | GPQA | 29.2 |
| | BBH | 60.3 |
| | IFEval (Prompt Strict-Acc.) | 39.2 |
| | ARC-C | 87.5 |
| Math | GSM8K | 75.4 |
| | MATH | 34.76 |
| Chinese | C-EVAL | 82.0 |
| | CMMLU | 77.9 |
| Code | MBPP | 50.6 |
| | HumanEval | 70.1 |