hrsun15 committed on
Commit
c1db185
1 Parent(s): 35bd235

Update README.md

Files changed (1)
  1. README.md +3 -5
README.md CHANGED
@@ -13,15 +13,13 @@ datasets:
 
 ## Model Summary
 
-FuxiTranyu-8B is a completely open source large language model trained from scratch, with a specific focus on the multilinguality.
-FuxiTranyu-8B was trained on 600B tokens with a much more smooth distribution across languages.
+FuxiTranyu-8B is an **open-source** **multilingual large language model** trained from scratch, with a specific focus on the multilinguality. It is trained on 600B tokens with a balanced data distribution across languages, exhibiting remarkable multilingual performance compared to previous multilingual LLMs like BLOOM-7B, PolyLM-13B.
 
-We cover 43 natural languages: Arabic, Bengali, Bulgarian, Burmese, Catalan, Chinese, Czech, Dutch, English, Filipino, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Malay, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Tamil, Tajik, Thai, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, and Vietnamese.
-And also cover 16 programming languages: Java, JavaScript, Python, PHP, C, C++, C#, TypeScript, Go, SQL, Rust, Ruby, Scala, Lua, Assembly, and Visual Basic.
+FuxiTranyu supports 43 natural languages (Arabic, Bengali, Bulgarian, Burmese, Catalan, Chinese, Czech, Dutch, English, Filipino, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Malay, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Tamil, Tajik, Thai, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, and Vietnamese) and cover 16 programming languages (Java, JavaScript, Python, PHP, C, C++, C#, TypeScript, Go, SQL, Rust, Ruby, Scala, Lua, Assembly, and Visual Basic).
 
 FuxiTranyu-8B-SFT is an instruct fine-tuned version of [FuxiTranyu-8B](https://huggingface.co/TJUNLP/FuxiTranyu-8B) model.
 
-More details can be found at our technical report.
+More details on the data collection & processing, pretraining and fine-tuning of FuxiTranyu can be found in the technical report.
 
 ## Usage
 ```python