---
license: apache-2.0
---

# THUDM's ChatGLM-6B GGML

These files are GGML-format model files for [THUDM's ChatGLM-6B](https://huggingface.co/THUDM/chatglm-6b).

GGML files are for CPU + GPU inference using [chatglm.cpp](https://github.com/li-plus/chatglm.cpp) and xorbits-inference (coming soon).

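As a minimal sketch of local inference with one of these files (the build steps and CLI flags below follow chatglm.cpp's README at the time of writing; verify them against the current repository):

```shell
# Clone and build chatglm.cpp (requires CMake and a C++ compiler).
git clone --recursive https://github.com/li-plus/chatglm.cpp
cd chatglm.cpp
cmake -B build
cmake --build build -j --config Release

# Run a quantized GGML file from this repository, e.g. the q4_0 variant.
./build/bin/main -m /path/to/chatglm-ggml-q4_0.bin -p "你好"
```
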
# Prompt template

**NOTE**: a prompt template is not available yet, since the system prompt is currently hard-coded in chatglm.cpp.

# Provided files

| Name | Quant method | Bits | Size |
|------|--------------|------|------|
| chatglm-ggml-q4_0.bin | q4_0 | 4 | 3.5 GB |
| chatglm-ggml-q4_1.bin | q4_1 | 4 | 3.9 GB |
| chatglm-ggml-q5_0.bin | q5_0 | 5 | 4.3 GB |
| chatglm-ggml-q5_1.bin | q5_1 | 5 | 4.6 GB |
| chatglm-ggml-q8_0.bin | q8_0 | 8 | 6.6 GB |

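To fetch one of the files above programmatically, the `huggingface_hub` client can be used; note that the repo id below is a placeholder and must be replaced with this repository's actual id on the Hub:

```python
from huggingface_hub import hf_hub_download

# NOTE: "your-namespace/chatglm-6b-ggml" is a placeholder repo id;
# substitute this repository's actual id on the Hub.
path = hf_hub_download(
    repo_id="your-namespace/chatglm-6b-ggml",
    filename="chatglm-ggml-q4_0.bin",
)
print(path)  # local cache path of the downloaded file
```
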
# How to run in xorbits-inference
Coming soon.

# Slack
For further support, and for discussions on these models and AI in general, join our [Slack channel](https://join.slack.com/t/xorbitsio/shared_invite/zt-1o3z9ucdh-RbfhbPVpx7prOVdM1CAuxg)!

# Original model card: THUDM's ChatGLM-6B
ChatGLM-6B is an open bilingual language model based on the General Language Model (GLM) framework, with 6.2 billion parameters. With quantization, users can deploy it locally on consumer-grade graphics cards (only 6 GB of GPU memory is required at the INT4 quantization level). ChatGLM-6B uses technology similar to ChatGPT, optimized for Chinese question answering and dialogue. The model was trained on about 1T tokens of Chinese and English corpus, supplemented by supervised fine-tuning, feedback bootstrapping, and reinforcement learning with human feedback. With only about 6.2 billion parameters, the model is able to generate answers that align with human preferences.
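
As a sketch of the INT4 deployment described above, following the usage shown in THUDM's model card (requires a CUDA GPU with roughly 6 GB of free memory and `trust_remote_code=True`, since ChatGLM-6B ships custom modeling code):

```python
from transformers import AutoModel, AutoTokenizer

# Load the checkpoint and quantize it to INT4 on the fly.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = model.half().quantize(4).cuda().eval()

# Single-turn chat; `history` carries the running conversation.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```
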

For more instructions, including how to run CLI and web demos, and on model quantization, please refer to our [GitHub repo](https://github.com/THUDM/ChatGLM-6B).