Upload 2 files
- README.md +1 -0
- README_en.md +1 -0
README.md
CHANGED
@@ -32,6 +32,7 @@ After fine-tuning, Qwen-14b-chat-yarn-32k has shown significant improvement in multi-document question-answering (or retrieval) tasks
 # Usage
 * Using this model automatically sets ```config.use_logn_attn=False``` and ```config.use_dynamic_ntk=True```; this produces a warning, which does not affect use of the model.
 * For long-text tasks, put the long reference text first and the user's question after it.
+* Be sure to install ```flash-attention2```; otherwise inference on long texts is extremely slow and may raise errors.
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
 
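The hunk above cuts off right after the import line, so for orientation here is a minimal loading sketch consistent with those notes. The checkpoint path is a placeholder, and the two config flags simply mirror the values the model code forces at load time (the warning mentioned in the README appears either way). FlashAttention-2 ships on PyPI as `flash-attn`, so installing it first (`pip install flash-attn`) avoids the slow long-context path called out in the added bullet.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig

MODEL_PATH = "path/to/Qwen-14b-chat-yarn-32k"  # placeholder checkpoint path

# Mirror the flags the model sets automatically; setting them here only
# documents the intent, the harmless warning is emitted regardless.
config = AutoConfig.from_pretrained(MODEL_PATH, trust_remote_code=True)
config.use_logn_attn = False
config.use_dynamic_ntk = True

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    config=config,
    device_map="auto",       # spread the 14B weights across available GPUs
    trust_remote_code=True,  # Qwen's custom modeling code (incl. flash-attn use)
).eval()
```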
README_en.md
CHANGED
@@ -31,6 +31,7 @@ Qwen-14b-chat-yarn-32k has shown significant improvement in multi-document question answering
 # Usage
 * When using this model, ```config.use_logn_attn=False``` and ```config.use_dynamic_ntk=True``` are set automatically, which produces a warning message; this does not affect the model's performance.
 * For tasks involving long texts, it is recommended to place the long reference text before the user's question.
+* Please make sure to install ```flash-attention2```; otherwise, inference on long texts will be extremely slow and errors may occur.
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
 
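The prompt-ordering advice also lends itself to a short sketch. Assuming this fine-tune keeps the ```chat()``` helper that stock Qwen chat models expose through ```trust_remote_code``` (the file name and question below are hypothetical, and ```model```/```tokenizer``` come from the loading sketch above), "reference text first, question last" looks like:

```python
# Hypothetical long-context QA call; assumes the standard Qwen chat() helper.
with open("long_document.txt") as f:  # placeholder reference document
    reference_text = f.read()

# Long reference text first, the user's question last, per the note above.
prompt = (
    reference_text
    + "\n\nBased on the document above, answer the question:\n"
    + "What are the key findings?"  # hypothetical question
)

response, history = model.chat(tokenizer, prompt, history=None)
print(response)
```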