Upload 2 files
- README.md +1 -0
- README_en.md +1 -0
README.md
CHANGED
@@ -32,6 +32,7 @@ After fine-tuning, Qwen-14b-chat-yarn-32k has shown significant improvement in multi-document question-answering (or retrieval) tasks
 # Usage
 * Using this model automatically sets ```config.use_logn_attn=False``` and ```config.use_dynamic_ntk=True```; this produces a warning, which does not affect use of the model.
 * For long-text tasks, put the long reference text first and the user's question after it.
+* Be sure to install ```flash-attention2```; otherwise inference on long texts is extremely slow and may raise errors.
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
 
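The hunk above cuts off right after the import line, so for orientation here is a minimal loading sketch consistent with those notes. The checkpoint path is a placeholder, and the two config flags simply mirror the values the model code forces at load time (the warning mentioned in the README appears either way). FlashAttention-2 ships on PyPI as `flash-attn`, so installing it first (`pip install flash-attn`) avoids the slow long-context path called out in the added bullet.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig

MODEL_PATH = "path/to/Qwen-14b-chat-yarn-32k"  # placeholder checkpoint path

# Mirror the flags the model sets automatically; setting them here only
# documents the intent, the harmless warning is emitted regardless.
config = AutoConfig.from_pretrained(MODEL_PATH, trust_remote_code=True)
config.use_logn_attn = False
config.use_dynamic_ntk = True

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    config=config,
    device_map="auto",       # spread the 14B weights across available GPUs
    trust_remote_code=True,  # Qwen's custom modeling code (incl. flash-attn use)
).eval()
```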
README_en.md
CHANGED
@@ -31,6 +31,7 @@ Qwen-14b-chat-yarn-32k has shown significant improvement in multi-document question answering
 # Usage
 * When using this model, ```config.use_logn_attn=False``` and ```config.use_dynamic_ntk=True``` are set automatically, which produces a warning message; this does not affect the model's performance.
 * For tasks involving long texts, it is recommended to place the long reference text before the user's question.
+* Please make sure to install ```flash-attention2```; otherwise, inference on long texts will be extremely slow and errors may occur.
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
 
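The prompt-ordering advice also lends itself to a short sketch. Assuming this fine-tune keeps the ```chat()``` helper that stock Qwen chat models expose through ```trust_remote_code``` (the file name and question below are hypothetical, and ```model```/```tokenizer``` come from the loading sketch above), "reference text first, question last" looks like:

```python
# Hypothetical long-context QA call; assumes the standard Qwen chat() helper.
with open("long_document.txt") as f:  # placeholder reference document
    reference_text = f.read()

# Long reference text first, the user's question last, per the note above.
prompt = (
    reference_text
    + "\n\nBased on the document above, answer the question:\n"
    + "What are the key findings?"  # hypothetical question
)

response, history = model.chat(tokenizer, prompt, history=None)
print(response)
```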