Text Generation
Transformers
PyTorch
skywork
custom_code
zhao1iang committed on
Commit 612186d
1 Parent(s): 68901eb

Update README.md

Files changed (1)
  1. README.md +25 -10
README.md CHANGED
@@ -13,7 +13,9 @@ license_link: >-
  <div align="center"><img src="misc/skywork_logo.jpeg" width="550"/></div>

  <p align="center">
- 🤗 <a href="https://huggingface.co/Skywork" target="_blank">Hugging Face</a> 🤖 <a href="https://modelscope.cn/organization/Skywork" target="_blank">ModelScope</a> • 💬 <a href="https://github.com/SkyworkAI/Skywork/blob/main/misc/wechat.png?raw=true" target="_blank">WeChat</a>• 📜<a href="https://arxiv.org/" target="_blank">Tech Report</a>• 🧮<a href="https://arxiv.org/" target="_blank">Skymath Paper</a>
  </p>

@@ -40,10 +42,10 @@ license_link: >-

  **Skywork-13B-Base**: The model was trained on a high-quality cleaned dataset consisting of 3.2 trillion tokens of multilingual data (mainly Chinese and English) and code. It has demonstrated the best performance among models of similar scale in various evaluations and benchmark tests.

- 如果您希望了解更多的信息,如训练方案,评估方法,请参考我们的[技术报告](https://arxiv.org/skywork-tech-report)和[Skywork-Math](https://arxiv.org/skywork-tech-report)论文。

- If you are interested in more training and evaluation details, please refer to our [technical report](https://arxiv.org/skywork-tech-report) and [Skywork-Math]((https://arxiv.org/skywork-tech-report)) paper.

  ## 训练数据(Training Data)
  We carefully built a data-cleaning pipeline to filter out low-quality text, harmful content, and sensitive information from the corpus. Our Skywork-13B-Base model was trained on 3.2TB of cleaned, high-quality Chinese, English, and code data: 52.2% English, 39.6% Chinese, and 8% code, balancing Chinese and English performance while preserving solid coding ability.
@@ -155,7 +157,8 @@ We evaluated Skywork-13B-Base on several popular benchmarks, including C-Eval, M
  | XVERSE-13B-Base | 54.7 | - | 55.1 | - |
  | Baichuan-13B-Base | 52.4 | 55.3 | 51.6 | 26.6 |
  | Baichuan-2-13B-Base | 58.1 | 62.0 | 59.2 | 52.3 |
- | Skywork-13B-Base (ours) | 59.5 | 61.6 | 61.6 | 55.8 |


  ## Benchmark评估详细结果(Detailed Benchmark Results)
  我们给出**Skywork-13B-Base**模型在C-Eval,CMMLU,MMLU上模型的详细结果。
@@ -164,9 +167,9 @@ We provide detailed results of the Skywork-13B-Base model on C-EVAL, CMMLU, and

  | Benchmark | **STEM** | **Humanities** | **Social Science** | **Other** | **China Specific** | **Hard** | **Average** |
  |:-----:|:---------:|:--------:|:-------------:|:--------:|:--------:|:--------:|:--------:|
- | **C-EVAL** | 51.5 | 65.1 | 73.9 | 55.1 | - | 39.9 | 59.5 |
- | **CMMLU** | 49.8 | 68.9 | 65.6 | 62.8 | 63.7 | - | 61.6 |
- | **MMLU** | 50.6 | 57.8 | 71.9 | 68.3 | - | - | 61.6 |


  # 快速开始(Quickstart)
@@ -331,10 +334,22 @@ If you find our work helpful, please feel free to cite our paper~

  ```
  @article{skyworkmath,
- title={},
- author={},
- journal={arXiv preprint arXiv:},
  year={2023}
  }
  ```

  <div align="center"><img src="misc/skywork_logo.jpeg" width="550"/></div>

  <p align="center">
+ 👨‍💻 <a href="https://github.com/SkyworkAI/Skywork" target="_blank">Github</a> • 🤗 <a href="https://huggingface.co/Skywork" target="_blank">Hugging Face</a> • 🤖 <a href="https://modelscope.cn/organization/Skywork" target="_blank">ModelScope</a> • 💬 <a href="https://github.com/SkyworkAI/Skywork/blob/main/misc/wechat.png?raw=true" target="_blank">WeChat</a> • 📜 <a href="https://arxiv.org/" target="_blank">Tech Report</a> • 🧮 <a href="https://arxiv.org/" target="_blank">Skymath Paper</a>
+ • 🖼️ <a href="https://github.com/will-singularity/Skywork-MM/blob/main/skywork_mm.pdf" target="_blank">SkyworkMM Paper</a>
+
  </p>


  **Skywork-13B-Base**: The model was trained on a high-quality cleaned dataset consisting of 3.2 trillion tokens of multilingual data (mainly Chinese and English) and code. It has demonstrated the best performance among models of similar scale in various evaluations and benchmark tests.

+ 如果您希望了解更多的信息,如训练方案,评估方法,请参考我们的[技术报告](https://arxiv.org/skywork-tech-report),[Skymath](https://arxiv.org/abs/2310.16713)论文和[SkyworkMM](https://github.com/will-singularity/Skywork-MM/blob/main/skywork_mm.pdf)论文。

+ If you are interested in more training and evaluation details, please refer to our [technical report](https://arxiv.org/skywork-tech-report), [Skymath](https://arxiv.org/abs/2310.16713) paper, and [SkyworkMM](https://github.com/will-singularity/Skywork-MM/blob/main/skywork_mm.pdf) paper.

  ## 训练数据(Training Data)
  We carefully built a data-cleaning pipeline to filter out low-quality text, harmful content, and sensitive information from the corpus. Our Skywork-13B-Base model was trained on 3.2TB of cleaned, high-quality Chinese, English, and code data: 52.2% English, 39.6% Chinese, and 8% code, balancing Chinese and English performance while preserving solid coding ability.
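The mixture above implies rough per-source token budgets. A throwaway sketch of that arithmetic (note the three quoted shares sum to 99.8%; the remaining 0.2% is not broken out in the text):

```python
# Split the 3.2T-token training corpus by the mixture shares quoted above.
TOTAL_TOKENS = 3.2e12
mixture = {"English": 0.522, "Chinese": 0.396, "Code": 0.08}

budget = {source: TOTAL_TOKENS * share for source, share in mixture.items()}
for source, tokens in budget.items():
    print(f"{source}: {tokens / 1e12:.3f}T tokens")
```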
 
  | XVERSE-13B-Base | 54.7 | - | 55.1 | - |
  | Baichuan-13B-Base | 52.4 | 55.3 | 51.6 | 26.6 |
  | Baichuan-2-13B-Base | 58.1 | 62.0 | 59.2 | 52.3 |
+ | Skywork-13B-Base (ours) | 60.6 | 61.8 | 62.1 | 55.8 |
+

  ## Benchmark评估详细结果(Detailed Benchmark Results)
  我们给出**Skywork-13B-Base**模型在C-Eval,CMMLU,MMLU上模型的详细结果。
 

  | Benchmark | **STEM** | **Humanities** | **Social Science** | **Other** | **China Specific** | **Hard** | **Average** |
  |:-----:|:---------:|:--------:|:-------------:|:--------:|:--------:|:--------:|:--------:|
+ | **C-EVAL** | 51.2 | 67.8 | 74.6 | 57.5 | - | 39.4 | 60.6 |
+ | **CMMLU** | 49.5 | 69.3 | 65.9 | 63.3 | 64.2 | - | 61.8 |
+ | **MMLU** | 51.6 | 58.0 | 72.5 | 68.8 | - | - | 62.1 |


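As a quick sanity check of the headline numbers, one can average the four scores reported for Skywork-13B-Base in the comparison table. A throwaway sketch (the fourth column's label is not visible in this diff; GSM8K is an assumption):

```python
# Skywork-13B-Base headline scores from the comparison table above.
# "GSM8K" is an assumed label for the fourth column, which is unlabeled here.
scores = {"C-EVAL": 60.6, "CMMLU": 61.8, "MMLU": 62.1, "GSM8K": 55.8}

average = sum(scores.values()) / len(scores)
print(f"Mean across the four benchmarks: {average:.2f}")
```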
  # 快速开始(Quickstart)
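The Quickstart section's body is elided from this diff; it boils down to loading the checkpoint with `transformers`. A minimal sketch, assuming the `Skywork/Skywork-13B-base` repo id and that the repository ships custom modeling code (hence `trust_remote_code=True`, matching the `custom_code` tag); the heavy import is deferred so the helpers can be defined without the dependency installed:

```python
MODEL_ID = "Skywork/Skywork-13B-base"  # assumed repo id; check the model card


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model; trust_remote_code=True is required because
    the repository ships custom modeling code."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # deferred: heavy dependency

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",  # shard the 13B weights across available devices
        trust_remote_code=True,
    )
    return tokenizer, model


def complete(tokenizer, model, prompt: str, max_new_tokens: int = 100) -> str:
    """Greedy continuation of a plain-text prompt (base model, no chat template)."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    tokenizer, model = load_model()
    print(complete(tokenizer, model, "陕西的省会是"))
```

The prompt is plain-text continuation rather than a chat turn because this is the base (non-chat) checkpoint.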
 

  ```
  @article{skyworkmath,
+ title={SkyMath: Technical Report},
+ author={Liu Yang and Haihua Yang and Wenjun Cheng and Lei Lin and Chenxia Li and Yifu Chen and Lunan Liu and Jianfei Pan and Tianwen Wei and Biye Li and Liang Zhao and Lijie Wang and Bo Zhu and Guoliang Li and Xuejie Wu and Xilin Luo and Rui Hu},
+ journal={arXiv preprint arXiv:2310.16713},
+ url={https://arxiv.org/abs/2310.16713},
  year={2023}
  }
  ```

+ ```
+ @article{Skywork_Multi-Modal_Group_Empirical_Study_Towards_2023,
+ author = {Skywork Multi-Modal Group},
+ month = sep,
+ title = {{Empirical Study Towards Building An Effective Multi-Modal Large Language Model}},
+ year = {2023}
+ }
+ ```