Update README.md
README.md
CHANGED
@@ -13,7 +13,9 @@ license_link: >-
<div align="center"><img src="misc/skywork_logo.jpeg" width="550"/></div>

<p align="center">
-🤗 <a href="https://huggingface.co/Skywork" target="_blank">Hugging Face</a>
+👨‍💻 <a href="https://github.com/SkyworkAI/Skywork" target="_blank">Github</a> • 🤗 <a href="https://huggingface.co/Skywork" target="_blank">Hugging Face</a> • 🤖 <a href="https://modelscope.cn/organization/Skywork" target="_blank">ModelScope</a> • 💬 <a href="https://github.com/SkyworkAI/Skywork/blob/main/misc/wechat.png?raw=true" target="_blank">WeChat</a> • 📜 <a href="https://arxiv.org/" target="_blank">Tech Report</a> • 🧮 <a href="https://arxiv.org/" target="_blank">Skymath Paper</a>
+• 🖼️ <a href="https://github.com/will-singularity/Skywork-MM/blob/main/skywork_mm.pdf" target="_blank">SkyworkMM Paper</a>
+
</p>
@@ -40,10 +42,10 @@ license_link: >-
**Skywork-13B-Base**: The model was trained on a high-quality cleaned dataset consisting of 3.2 trillion tokens of multilingual data (mainly Chinese and English) and code. It has demonstrated the best performance among models of similar scale in various evaluations and benchmark tests.

-If you would like to learn more, for example about the training recipe or the evaluation methodology, please refer to our [technical report](https://arxiv.org/skywork-tech-report) and the [Skywork-Math](https://arxiv.org/skywork-tech-report) paper.
-
+If you are interested in more training and evaluation details, please refer to our [technical report](https://arxiv.org/skywork-tech-report), the [Skymath](https://arxiv.org/abs/2310.16713) paper, and the [SkyworkMM](https://github.com/will-singularity/Skywork-MM/blob/main/skywork_mm.pdf) paper.
+

## Training Data
We carefully built a data-cleaning pipeline to filter low-quality text, harmful content, and sensitive information out of the corpus. Our Skywork-13B-Base model was trained on the resulting 3.2 TB of high-quality Chinese, English, and code data, of which English accounts for 52.2%, Chinese for 39.6%, and code for 8%, so the model balances Chinese and English performance while maintaining solid coding ability.
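The mixture figures above describe the corpus composition rather than a procedure. As a purely illustrative sketch (an assumption, not Skywork's actual data pipeline), sampling training examples from three sources at roughly those proportions could look like this:

```python
import random

# Corpus shares quoted in the README above; the source names and sampling logic are illustrative only.
MIXTURE = {"english": 0.522, "chinese": 0.396, "code": 0.08}

def sample_source(rng: random.Random) -> str:
    """Pick a data source with probability proportional to its share of the corpus."""
    names = list(MIXTURE)
    weights = [MIXTURE[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)
counts = {name: 0 for name in MIXTURE}
for _ in range(10_000):
    counts[sample_source(rng)] += 1
print(counts)  # roughly 5200 english, 4000 chinese, 800 code
```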
@@ -155,7 +157,8 @@ We evaluated Skywork-13B-Base on several popular benchmarks, including C-Eval, M
| XVERSE-13B-Base | 54.7 | - | 55.1 | - |
| Baichuan-13B-Base | 52.4 | 55.3 | 51.6 | 26.6 |
| Baichuan-2-13B-Base | 58.1 | 62.0 | 59.2 | 52.3 |
-| Skywork-13B-Base (ours) |
+| Skywork-13B-Base (ours) | 60.6 | 61.8 | 62.1 | 55.8 |
+

## Detailed Benchmark Results
We report the detailed results of the **Skywork-13B-Base** model on C-Eval, CMMLU, and MMLU.
@@ -164,9 +167,9 @@ We provide detailed results of the Skywork-13B-Base model on C-EVAL, CMMLU, and
| Benchmark | **STEM** | **Humanities** | **Social Science** | **Other** | **China Specific** | **Hard** | **Average** |
|:-----:|:---------:|:--------:|:-------------:|:--------:|:--------:|:--------:|:--------:|
-| **C-EVAL** | 51.
-| **CMMLU** | 49.
-| **MMLU** |
+| **C-EVAL** | 51.2 | 67.8 | 74.6 | 57.5 | - | 39.4 | 60.6 |
+| **CMMLU** | 49.5 | 69.3 | 65.9 | 63.3 | 64.2 | - | 61.8 |
+| **MMLU** | 51.6 | 58.0 | 72.5 | 68.8 | - | - | 62.1 |


# Quickstart
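The Quickstart section itself is not touched by this diff. For orientation, here is a minimal sketch of loading a base model like this with the Hugging Face transformers API; the repo id `Skywork/Skywork-13B-base`, the bfloat16 dtype, and `trust_remote_code=True` are assumptions, so check the model card for the exact snippet.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id -- confirm against the model card.
model_id = "Skywork/Skywork-13B-base"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: saves memory on recent GPUs
    device_map="auto",           # requires the accelerate package
    trust_remote_code=True,
)

# A base (non-chat) model continues text rather than following instructions.
inputs = tokenizer("The capital of Shaanxi province is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```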
@@ -331,10 +334,22 @@ If you find our work helpful, please feel free to cite our paper~
```
@article{skyworkmath,
-  title={},
-  author={},
-  journal={arXiv preprint arXiv:},
+  title={SkyMath: Technical Report},
+  author={Liu Yang and Haihua Yang and Wenjun Cheng and Lei Lin and Chenxia Li and Yifu Chen and Lunan Liu and Jianfei Pan and Tianwen Wei and Biye Li and Liang Zhao and Lijie Wang and Bo Zhu and Guoliang Li and Xuejie Wu and Xilin Luo and Rui Hu},
+  journal={arXiv preprint arXiv:2310.16713},
+  url={https://arxiv.org/abs/2310.16713},
  year={2023}
}
```
+
+```
+@article{Skywork_Multi-Modal_Group_Empirical_Study_Towards_2023,
+  author = {Skywork Multi-Modal Group},
+  month = sep,
+  title = {{Empirical Study Towards Building An Effective Multi-Modal Large Language Model}},
+  year = {2023}
+}
+```