frankminors123 committed Update README.md (commit 0671c1f, parent 19456b1)

README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 ---
 # Chinese-CodeLlama-7B-PT
 
-We have further expanded the vocabulary of Chinese-LLaMA-2-7B from 55,296 to 75,548 tokens; notably, most of the added tokens are code tokens. We pre-trained the model with LoRA.
+We have further expanded the vocabulary of Chinese-LLaMA-2-7B from 55,296 to 75,548 tokens; notably, most of the added tokens are code tokens. We pre-trained the model with LoRA at rank 8, with `q_proj` and `v_proj` as the trainable LoRA layers, while the `embed_tokens` and `lm_head` layers were trained with full parameters.
 
 The training data contains approximately 400 million tokens drawn from high-quality code datasets on Hugging Face.
 
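The LoRA scheme the README describes can be sketched in pure Python: a frozen base weight `W` (standing in for a `q_proj` or `v_proj` matrix) plus a trainable rank-8 update `B @ A`. The matrix dimensions below are illustrative assumptions, not values from the model; only the rank `r=8` comes from the source.

```python
import random

random.seed(0)
d_in, d_out, r = 12, 12, 8   # r=8 as stated in the README; dims are illustrative

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

def matmul(X, Y):
    # Naive matrix multiply for lists of lists.
    n, k, m = len(X), len(Y), len(Y[0])
    out = zeros(n, m)
    for i in range(n):
        for j in range(m):
            out[i][j] = sum(X[i][t] * Y[t][j] for t in range(k))
    return out

# Frozen base weight W (d_out x d_in), e.g. an attention projection matrix.
W = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]

# Trainable LoRA factors: A is (r x d_in), B is (d_out x r).
# B starts at zero, the standard LoRA init, so the initial update is a no-op.
A = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(r)]
B = zeros(d_out, r)

delta = matmul(B, A)  # rank-at-most-8 update B @ A
W_eff = [[W[i][j] + delta[i][j] for j in range(d_in)] for i in range(d_out)]

# With B initialized to zero, the effective weight equals the base weight.
assert W_eff == W
```

During training only `A` and `B` (and, per the README, the full `embed_tokens` and `lm_head` matrices) would receive gradients, which is what keeps the number of trainable parameters small relative to full fine-tuning.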