frankminors123 committed on
Commit c3cfe18
1 Parent(s): 0abf422

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -8,7 +8,9 @@ tags:
  ---
  # Chinese-CodeLlama-7B-PT

- We have further expanded the vocabulary of Chinese-LLaMA-2-7B from 55,296 to 75,548 tokens; notably, most of the new tokens are code tokens. We pre-trained the model using LoRA with rank 8: the trainable LoRA layers are `q_proj` and `v_proj`, while the `embed_tokens` and `lm_head` layers were trained with full parameters. All trainable parameters are float32.
+ We have further expanded the vocabulary of Chinese-LLaMA-2-7B from 55,296 to 75,548 tokens; notably, most of the new tokens are code tokens. On [MBPP](https://huggingface.co/datasets/mbpp), we calculated the compression rate of the tokenizer to be 38.59%.
+
+ We pre-trained the model using LoRA with rank 8: the trainable LoRA layers are `q_proj` and `v_proj`, while the `embed_tokens` and `lm_head` layers were trained with full parameters. All trainable parameters are float32.

  The training data contains approximately 400 million tokens drawn from high-quality code datasets on HuggingFace.
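
The commit does not say how the 38.59% compression rate was computed. A minimal sketch of one plausible measurement is below: it assumes "compression rate" means the relative reduction in token count versus the base Chinese-LLaMA-2-7B tokenizer over MBPP text-plus-code, and the repo IDs are illustrative, not confirmed by this commit.

```python
# Sketch only: one plausible way to measure the MBPP tokenizer statistic.
# Assumptions (not stated in the commit): "compression rate" = relative
# reduction in token count vs. the base tokenizer; repo IDs are illustrative.
from datasets import load_dataset
from transformers import AutoTokenizer

base_tok = AutoTokenizer.from_pretrained("hfl/chinese-llama-2-7b")                  # 55,296-token vocab
code_tok = AutoTokenizer.from_pretrained("frankminors123/Chinese-CodeLlama-7B-PT")  # 75,548-token vocab

mbpp = load_dataset("mbpp", split="test")

base_count = expanded_count = 0
for ex in mbpp:
    text = ex["text"] + "\n" + ex["code"]  # problem statement plus reference solution
    base_count += len(base_tok(text).input_ids)
    expanded_count += len(code_tok(text).input_ids)

# Fewer tokens for the same text => positive compression.
print(f"compression rate: {1 - expanded_count / base_count:.2%}")
```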
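
The LoRA setup described in the diff maps naturally onto the Hugging Face PEFT API. The sketch below illustrates that mapping; it is not the authors' training script. Only the rank of 8, the `q_proj`/`v_proj` target modules, the fully trained `embed_tokens`/`lm_head`, and float32 come from the README, while the base checkpoint, `lora_alpha`, and dropout are assumptions.

```python
# Sketch of the described setup via Hugging Face PEFT; not the authors' code.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "hfl/chinese-llama-2-7b",   # assumed base checkpoint
    torch_dtype=torch.float32,  # README: all trainable parameters are float32
)
model.resize_token_embeddings(75548)  # expanded vocabulary from the README

peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=8,                                  # LoRA rank stated in the README
    lora_alpha=16,                        # assumed; not stated in the commit
    lora_dropout=0.05,                    # assumed; not stated in the commit
    target_modules=["q_proj", "v_proj"],  # trainable LoRA layers
    modules_to_save=["embed_tokens", "lm_head"],  # trained with full parameters
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # LoRA adapters + embed_tokens + lm_head
```

With `modules_to_save`, PEFT keeps full-parameter copies of `embed_tokens` and `lm_head` trainable alongside the rank-8 adapters, which matches the split the README describes.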