frankminors123/Chinese-CodeLlama-7B-PT

frankminors123 committed on Sep 27, 2023

Commit a3a74c7 • 1 Parent(s): eb15d06

Create README.md

Files changed (1)

README.md +17 -0

	@@ -0,0 +1,17 @@

+---
+license: apache-2.0
+language:
+- zh
+- en
+tags:
+- code
+---
+# Chinese-CodeLlama-7B-PT
+We further expanded the vocabulary of Chinese-LLaMA-2-7B from 55296 to 75548 tokens and continued pre-training the model with LoRA, with the `embed_tokens` and `lm_head` layers included in training.
+The training data contains approximately 400 million tokens drawn from high-quality code datasets on HuggingFace.
+In addition, we applied `memory_efficient_attention` during pre-training, which saved a substantial amount of GPU memory.
+The model can be used as a base for supervised fine-tuning (SFT), and we hope to contribute more valuable work to the Chinese-language community.
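
The vocabulary expansion described in the README helps explain why `embed_tokens` and `lm_head` are trained alongside the LoRA adapters: every new token adds a freshly initialized row to each vocabulary-sized matrix. The sketch below is only a back-of-the-envelope estimate and not part of the commit. It assumes an original vocabulary of 55296 (the size published for Chinese-LLaMA-2-7B), a hidden size of 4096, and untied input and output embeddings, as in LLaMA-2-7B; the function names are hypothetical.

```python
# Rough estimate of the parameters introduced by the vocabulary expansion.
OLD_VOCAB = 55296    # assumption: vocabulary size of Chinese-LLaMA-2-7B
NEW_VOCAB = 75548    # expanded vocabulary size stated in the README
HIDDEN_SIZE = 4096   # assumption: hidden dimension of LLaMA-2-7B


def added_rows(old_vocab: int, new_vocab: int) -> int:
    """Number of new token rows added to each vocabulary-sized matrix."""
    if new_vocab < old_vocab:
        raise ValueError("expansion must not shrink the vocabulary")
    return new_vocab - old_vocab


def added_parameters(old_vocab: int, new_vocab: int, hidden_size: int,
                     untied: bool = True) -> int:
    """New parameters in embed_tokens (and lm_head when weights are untied)."""
    matrices = 2 if untied else 1
    return added_rows(old_vocab, new_vocab) * hidden_size * matrices


if __name__ == "__main__":
    rows = added_rows(OLD_VOCAB, NEW_VOCAB)
    params = added_parameters(OLD_VOCAB, NEW_VOCAB, HIDDEN_SIZE)
    print(f"new token rows: {rows}")
    print(f"new parameters in embed_tokens + lm_head: {params}")
```

Under these assumptions the expansion adds 20252 token rows, or roughly 166 million newly initialized parameters across the two layers, which low-rank adapters alone would leave untrained.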