IlysvlVEizbr committed on
Commit
970e4b6
1 Parent(s): 29cbdb2
Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -244,7 +244,7 @@ The accuracy of Qwen-7B-Chat on GSM8K is shown below
 
 通过NTK插值,LogN注意力缩放可以扩展Qwen-7B-Chat的上下文长度。在长文本摘要数据集[VCSUM](https://arxiv.org/abs/2305.05280)上(文本平均长度在15K左右),Qwen-7B-Chat的Rouge-L结果如下:
 
-**(若要启用这些技巧,请将config.json里的`use_dynamc_ntk`和`use_logn_attn`设置为true)**
+**(若要启用这些技巧,请将config.json里的`use_dynamic_ntk`和`use_logn_attn`设置为true)**
 
 We introduce NTK-aware interpolation, LogN attention scaling to extend the context length of Qwen-7B-Chat. The Rouge-L results of Qwen-7B-Chat on long-text summarization dataset [VCSUM](https://arxiv.org/abs/2305.05280) (The average length of this dataset is around 15K) are shown below:
 
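
Note: the changed line is the Chinese-language instruction "To enable these techniques, set `use_dynamic_ntk` and `use_logn_attn` to true in config.json"; the commit only corrects the misspelled key `use_dynamc_ntk`. A minimal sketch of applying that instruction with the Python standard library follows. The helper names `enable_long_context` and `patch_config_file` are hypothetical and are not part of the repository; only the two flag names come from the README line.

```python
import json
from pathlib import Path

# Flag names exactly as spelled in the corrected README line.
LONG_CONTEXT_FLAGS = ("use_dynamic_ntk", "use_logn_attn")


def enable_long_context(config: dict) -> dict:
    """Return a copy of `config` with both long-context flags set to True."""
    updated = dict(config)
    for flag in LONG_CONTEXT_FLAGS:
        updated[flag] = True
    return updated


def patch_config_file(path: Path) -> None:
    """Rewrite the JSON file at `path` in place with both flags enabled."""
    config = json.loads(path.read_text(encoding="utf-8"))
    patched = enable_long_context(config)
    path.write_text(
        json.dumps(patched, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

Assuming a local copy of the model files, a call such as `patch_config_file(Path("config.json"))` would update the file; the path here is a placeholder. A key spelled `use_dynamc_ntk`, as in the old README text, would simply be an extra entry in the JSON and would leave the real flag unchanged, which is why the one-character fix matters.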