frankminors123 committed
Commit 8bcfb35
1 Parent(s): 49ad1ea

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -11,6 +11,8 @@ the base period of rotary positional embeddings (RoPE) from 10000 to 1000000.
 We use a sequence length of 1k for pre-training, and continue training based on this length during the fine-tuning stage. Based on a larger base period of RoPE, it can support up to 15k context length extrapolation at inference time.
 
+ Based on this [dataset](https://huggingface.co/datasets/code_search_net), we calculate the average PPL on 1k-length text to be 5.44; however, this value is 148.70 for our pre-trained model.
+
 The Chinese prompt template used is as follows:
 ```python
 PROMPT_TEMPLATE = (
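
For reference, a minimal sketch of how an average PPL over 1k-length text from code_search_net could be computed with a causal LM. The checkpoint name, dataset config, split, text field, and subset size below are illustrative assumptions, not the exact evaluation setup used for the reported numbers:

```python
# Sketch: average perplexity over 1k-token chunks of code_search_net.
# MODEL_NAME, the "python" config, the test split, and the 100-sample subset
# are placeholders for illustration only.
import math

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "codellama/CodeLlama-7b-hf"  # placeholder; substitute the model being evaluated
SEQ_LEN = 1024                            # "1k length text"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.float16)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

dataset = load_dataset("code_search_net", "python", split="test", trust_remote_code=True)

nlls = []
for sample in dataset.select(range(100)):  # small subset to keep the sketch fast
    # Truncate each sample to at most SEQ_LEN tokens.
    ids = tokenizer(sample["whole_func_string"], return_tensors="pt").input_ids[:, :SEQ_LEN]
    if ids.shape[1] < 2:
        continue
    ids = ids.to(device)
    with torch.no_grad():
        # With labels == input_ids, the model returns the mean token-level cross-entropy.
        loss = model(ids, labels=ids).loss
    nlls.append(loss.item())

avg_ppl = math.exp(sum(nlls) / len(nlls))
print(f"Average PPL over {len(nlls)} chunks of up to {SEQ_LEN} tokens: {avg_ppl:.2f}")
```

This sketch averages the per-sample negative log-likelihoods and then exponentiates; a token-count-weighted average over the chunks would give a slightly different (and arguably more standard) figure.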