frankminors123 committed
Commit a3a74c7
1 Parent(s): eb15d06

Create README.md

Files changed (1):
1. README.md +17 -0
---
license: apache-2.0
language:
- zh
- en
tags:
- code
---
# Chinese-CodeLlama-7B-PT

We further expanded the vocabulary of Chinese-LLaMA-2-7B from 55296 to 75548 tokens, and pre-trained the model with LoRA, including the `embed_tokens` and `lm_head` layers.
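The two steps above can be sketched numerically. This is a minimal NumPy illustration of the idea, not the actual training code: the toy dimensions, rank, and scaling values are made-up examples (the real vocabulary grows from 55296 to 75548, and the 7B model's hidden size is 4096).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes for illustration only.
old_vocab, new_vocab, hidden = 12, 16, 8

# 1) Expand embed_tokens: keep the old rows, initialize rows for new tokens.
embed = rng.normal(size=(old_vocab, hidden)).astype(np.float32)
expanded = np.zeros((new_vocab, hidden), dtype=np.float32)
expanded[:old_vocab] = embed
expanded[old_vocab:] = rng.normal(scale=0.02, size=(new_vocab - old_vocab, hidden))

# 2) LoRA on a weight W: learn a low-rank update B @ A instead of all of W.
r, alpha = 2, 4                                   # example rank / scaling
W = rng.normal(size=(hidden, hidden)).astype(np.float32)
A = rng.normal(scale=0.01, size=(r, hidden)).astype(np.float32)
B = np.zeros((hidden, r), dtype=np.float32)       # zero init: delta starts at 0
W_eff = W + (alpha / r) * (B @ A)                 # effective weight during training

# Far fewer trainable parameters than updating W directly.
lora_params = A.size + B.size
assert lora_params < W.size
```

In practice this kind of setup is usually expressed through the `peft` library's `LoraConfig`, with `embed_tokens` and `lm_head` either adapted or fully trained via `modules_to_save`; the exact configuration used here is not stated in the text.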

The training data contains approximately 400 million tokens, drawn from high-quality code datasets on Hugging Face.

In addition, we applied `memory_efficient_attention` during pre-training, which saves a significant amount of GPU memory.
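`memory_efficient_attention` (from the xformers library) avoids materializing the full attention score matrix. A minimal NumPy sketch of the query-chunking part of the idea follows; the real kernel also streams over keys with an online softmax, which this sketch omits:

```python
import numpy as np

def naive_attention(q, k, v):
    # Materializes the full (n_q, n_k) score matrix -- O(n^2) memory.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def chunked_attention(q, k, v, chunk=4):
    # Processes queries in chunks so only a (chunk, n_k) slice of the
    # score matrix exists at any time -- the core memory saving.
    out = np.empty_like(q)
    for i in range(0, q.shape[0], chunk):
        out[i:i + chunk] = naive_attention(q[i:i + chunk], k, v)
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(16, 8)) for _ in range(3))
full = naive_attention(q, k, v)
tiled = chunked_attention(q, k, v)
```

Because softmax is applied row-by-row, chunking over queries is exact: `tiled` matches `full` while never holding more than a `(chunk, n_k)` block of scores.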

Our model can serve as a base for supervised fine-tuning (SFT), and we hope to contribute more valuable work to the Chinese NLP community.