wenge-research committed • Commit 321524b • 1 Parent(s): dc16eed
Update README.md

README.md CHANGED
@@ -27,13 +27,48 @@ YAYI 2 is a collection of open-source large language models launched by Wenge Te

For more details about the YAYI 2, please refer to our GitHub repository. Stay tuned for more technical details in our upcoming technical report! 🔥

-##

| Model Name | Context Length | 🤗 HF Model Name |
|:----------|:----------:|:----------:|
| YAYI2-30B | 4096 | wenge-research/yayi2-30b|
## 评测结果/Evaluation

We evaluated the model on multiple benchmark datasets, including C-Eval, MMLU, CMMLU, AGIEval, GAOKAO-Bench, GSM8K, MATH, BBH, HumanEval, and MBPP, examining its performance in language understanding, subject knowledge, mathematical reasoning, logical reasoning, and code generation. The YAYI 2 model demonstrates significant performance improvements over open-source models of similar size.
@@ -200,27 +235,6 @@ We evaluate our model on standard benchmarks, including C-Eval, MMLU, CMMLU, AGI

We evaluate our model using the source code from the [OpenCompass Github repository](https://github.com/open-compass/opencompass). Where available, we report results for comparative models assessed by OpenCompass, with the evaluation reference date set to Dec. 15th, 2023. For MPT, Falcon, and Llama, which have not been evaluated by OpenCompass, we use the results reported in the [LLaMA 2](https://arxiv.org/abs/2307.09288) paper.
-## 快速开始/Quick Start
-
-```python
->>> from transformers import AutoModelForCausalLM, AutoTokenizer
->>> tokenizer = AutoTokenizer.from_pretrained("wenge-research/yayi2-30b", trust_remote_code=True)
->>> model = AutoModelForCausalLM.from_pretrained("wenge-research/yayi2-30b", device_map="auto", trust_remote_code=True)
->>> inputs = tokenizer('The winter in Beijing is', return_tensors='pt')
->>> inputs = inputs.to('cuda')
->>> pred = model.generate(
-        **inputs,
-        max_new_tokens=256,
-        eos_token_id=tokenizer.eos_token_id,
-        do_sample=True,
-        repetition_penalty=1.2,
-        temperature=0.4,
-        top_k=100,
-        top_p=0.8
-    )
->>> print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
-```
-

## 协议/License
For more details about the YAYI 2, please refer to our GitHub repository. Stay tuned for more technical details in our upcoming technical report! 🔥

+## 模型细节/Model Details

| Model Name | Context Length | 🤗 HF Model Name |
|:----------|:----------:|:----------:|
| YAYI2-30B | 4096 | wenge-research/yayi2-30b|
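As a quick sanity check on the table above, the advertised context length can be read back from the model config. A minimal sketch, assuming the remote config exposes the standard `max_position_embeddings` field:

```python
from transformers import AutoConfig

# Load only the config (no weights); trust_remote_code matches the Quick Start usage
config = AutoConfig.from_pretrained("wenge-research/yayi2-30b", trust_remote_code=True)
print(config.max_position_embeddings)  # expected: 4096, per the table above
```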
+## 要求/Requirements
+
+- python 3.8 and above
+- pytorch 1.12 and above; 2.0 and above is recommended
+- CUDA 11.4 and above is recommended (relevant for GPU users, flash-attention users, etc.)
+- Running the model in BF16/FP16 requires at least 144GB of GPU memory across multiple cards (e.g., 2xA100-80G or 5xV100-32G); running it in Int4 requires at least 48GB (e.g., 1xA100-80G or 2xV100-32G). A loading sketch follows the list below.
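The memory figures above correspond to two loading paths. A minimal sketch of both: the BF16 path mirrors the Quick Start usage, while the Int4 path is an assumption on our part, using the standard `transformers` bitsandbytes integration (`BitsAndBytesConfig`), which the README itself does not mention.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Pick ONE of the two paths below; loading both at once needs the combined memory.

# BF16: shards the 30B model across all visible GPUs
# (needs ~144GB total, e.g. 2xA100-80G or 5xV100-32G)
model_bf16 = AutoModelForCausalLM.from_pretrained(
    "wenge-research/yayi2-30b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Int4: assumed path via bitsandbytes 4-bit quantization
# (needs ~48GB, e.g. 1xA100-80G); not confirmed by the original README
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model_int4 = AutoModelForCausalLM.from_pretrained(
    "wenge-research/yayi2-30b",
    quantization_config=quant_config,
    device_map="auto",
    trust_remote_code=True,
)
```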
+## 快速开始/Quick Start
+
+```python
+>>> from transformers import AutoModelForCausalLM, AutoTokenizer
+>>> tokenizer = AutoTokenizer.from_pretrained("wenge-research/yayi2-30b", trust_remote_code=True)
+>>> model = AutoModelForCausalLM.from_pretrained("wenge-research/yayi2-30b", device_map="auto", trust_remote_code=True)
+>>> inputs = tokenizer('The winter in Beijing is', return_tensors='pt')
+>>> inputs = inputs.to('cuda')
+>>> pred = model.generate(
+        **inputs,
+        max_new_tokens=256,
+        eos_token_id=tokenizer.eos_token_id,
+        do_sample=True,
+        repetition_penalty=1.2,
+        temperature=0.4,
+        top_k=100,
+        top_p=0.8
+    )
+>>> print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
+```
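For interactive use, the same session can stream tokens as they are generated rather than waiting for the full output. A minimal sketch, continuing from the Quick Start objects above and assuming the checkpoint works with transformers' standard `TextStreamer`:

```python
from transformers import TextStreamer

# Prints each decoded token to stdout as it is generated
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.4,
    streamer=streamer,
)
```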
## 评测结果/Evaluation

We evaluated the model on multiple benchmark datasets, including C-Eval, MMLU, CMMLU, AGIEval, GAOKAO-Bench, GSM8K, MATH, BBH, HumanEval, and MBPP, examining its performance in language understanding, subject knowledge, mathematical reasoning, logical reasoning, and code generation. The YAYI 2 model demonstrates significant performance improvements over open-source models of similar size.
We evaluate our model using the source code from the [OpenCompass Github repository](https://github.com/open-compass/opencompass). Where available, we report results for comparative models assessed by OpenCompass, with the evaluation reference date set to Dec. 15th, 2023. For MPT, Falcon, and Llama, which have not been evaluated by OpenCompass, we use the results reported in the [LLaMA 2](https://arxiv.org/abs/2307.09288) paper.
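For reproduction, OpenCompass drives evaluations from Python config modules. A sketch of a model entry in the style of the public example configs in the OpenCompass repository; `HuggingFaceCausalLM` and these field names are assumptions drawn from those examples, not from this README, and may need adjusting to the OpenCompass version used for these results.

```python
# Sketch of an OpenCompass model config (configs/models/*.py style)
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        abbr='yayi2-30b-hf',
        path='wenge-research/yayi2-30b',
        tokenizer_path='wenge-research/yayi2-30b',
        model_kwargs=dict(device_map='auto', trust_remote_code=True),
        tokenizer_kwargs=dict(trust_remote_code=True),
        max_seq_len=4096,   # matches the context length in the table above
        max_out_len=100,
        batch_size=8,
        run_cfg=dict(num_gpus=4),
    )
]
```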
## 协议/License