---
license: apache-2.0
---

# THUDM's chatglm 6B GGML

These files are GGML format model files for [THUDM's chatglm 6B](https://huggingface.co/THUDM/chatglm-6b).

GGML files are for CPU + GPU inference using [chatglm.cpp](https://github.com/li-plus/chatglm.cpp) and xorbits-inference (coming soon).
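
For a quick local test with the chatglm.cpp Python bindings (`pip install chatglm-cpp`), something like the sketch below should work; the `Pipeline` call signature is an assumption based on early chatglm.cpp releases, so check that project's README if it has changed:

```python
# Minimal sketch of CPU inference over a GGML file with chatglm.cpp's
# Python bindings. Assumes `pip install chatglm-cpp` and that one of the
# quantized files listed below has been downloaded to the working directory.
import chatglm_cpp

pipeline = chatglm_cpp.Pipeline("./chatglm-ggml-q4_0.bin")

# Early releases take the chat history as a list of strings and
# return the model's reply as a string.
print(pipeline.chat(["What is GGML quantization?"]))
```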

# Prompt template

**NOTE**: A prompt template is not available yet because the system prompt is currently hard-coded in chatglm.cpp.

# Provided files

| Name | Quant method | Bits | Size |
|------|--------------|------|------|
| chatglm-ggml-q4_0.bin | q4_0 | 4 | 3.5 GB |
| chatglm-ggml-q4_1.bin | q4_1 | 4 | 3.9 GB |
| chatglm-ggml-q5_0.bin | q5_0 | 5 | 4.3 GB |
| chatglm-ggml-q5_1.bin | q5_1 | 5 | 4.6 GB |
| chatglm-ggml-q8_0.bin | q8_0 | 8 | 6.6 GB |
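
Each file can be fetched programmatically with `huggingface_hub`; a minimal sketch, where the `repo_id` is a placeholder to replace with this repository's actual id:

```python
# Sketch: download one of the quantized files from the Hugging Face Hub.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="<this-repo-id>",          # placeholder -- use this repository's id
    filename="chatglm-ggml-q4_0.bin",  # any file from the table above
)
print(path)  # local cache path of the downloaded file
```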

# How to run in xorbits-inference

Coming soon.

# Slack

For further support, and discussions on these models and AI in general, join our [slack channel](https://join.slack.com/t/xorbitsio/shared_invite/zt-1o3z9ucdh-RbfhbPVpx7prOVdM1CAuxg)!

# Original model card: THUDM's chatglm 6B

ChatGLM-6B is an open bilingual language model based on the General Language Model (GLM) framework, with 6.2 billion parameters. With quantization, users can deploy it locally on consumer-grade graphics cards (only 6 GB of GPU memory is required at the INT4 quantization level). ChatGLM-6B uses technology similar to ChatGPT, optimized for Chinese QA and dialogue. The model is trained on about 1T tokens of Chinese and English corpus, supplemented by supervised fine-tuning, feedback bootstrapping, and reinforcement learning with human feedback. With only about 6.2 billion parameters, the model is able to generate answers that align with human preference.
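
For reference, the upstream repository documents Python usage along the following lines (a sketch adapted from the THUDM/ChatGLM-6B README; `.quantize(4)` selects the INT4 level mentioned above):

```python
# Sketch of upstream ChatGLM-6B inference via Hugging Face transformers,
# adapted from the THUDM/ChatGLM-6B README. Requires a CUDA GPU;
# quantize(4) enables the INT4 mode (~6 GB of GPU memory).
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = model.half().quantize(4).cuda().eval()

response, history = model.chat(tokenizer, "Hello", history=[])
print(response)
```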

For more instructions, including how to run CLI and web demos, and model quantization, please refer to our [Github Repo](https://github.com/THUDM/ChatGLM-6B).
|