---
language:
- zh
- en
tags:
- glm
- chatglm
- ggml
---
# ChatGLM3-6B-int8

## Introduction
ChatGLM3-6B is the latest generation of open-source models in the ChatGLM series: [THUDM/chatglm3-6b](https://huggingface.co/THUDM/chatglm3-6b).

The Q8_0 weights stored in this repository were generated with [chatglm.cpp](https://github.com/li-plus/chatglm.cpp), using its GGML-based quantization.
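
To reproduce the quantized file, a minimal sketch using chatglm.cpp's `convert.py` script could look like the following (the flags follow the chatglm.cpp README; the script location and options may differ between versions):

```sh
# Convert the original Hugging Face checkpoint to GGML format with q8_0 quantization.
# -i: source model (Hugging Face repo id or local path), -t: quantization type, -o: output file.
python3 chatglm_cpp/convert.py -i THUDM/chatglm3-6b -t q8_0 -o chatglm3-ggml-q8_0.bin
```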
## Performance
| Model                  | GGML quantization | File size | Time per token\* |
|------------------------|-------------------|-----------|------------------|
| chatglm3-ggml-q8_0.bin | q8_0              | 6.64 GB   | 114 ms           |

\* Milliseconds per token on an Intel Xeon Platinum 8260 CPU, from the [chatglm.cpp benchmark](https://github.com/li-plus/chatglm.cpp#performance).
## Getting Started
1. Install dependencies
```sh
pip install chatglm-cpp transformers
```

2. Download the weights
```sh
wget https://huggingface.co/npc0/chatglm3-6b-int8/resolve/main/chatglm3-ggml-q8_0.bin
```

3. Run the model in Python
```py
import chatglm_cpp

# Load the quantized GGML weights from the current directory.
pipeline = chatglm_cpp.Pipeline("./chatglm3-ggml-q8_0.bin")
# Ask a single question; the list holds the chat history (user and assistant turns alternate).
pipeline.chat(["你好"])
# Output: 你好👋!我是人工智能助手 ChatGLM3-6B,很高兴见到你,欢迎问我任何问题。
# (English: "Hello 👋! I'm the AI assistant ChatGLM3-6B. Nice to meet you, feel free to ask me anything.")
```
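
Multi-turn chat works by growing the same history list. Below is a minimal sketch, assuming the list-of-strings interface shown above (newer chatglm-cpp releases may expect a different message format):

```py
import chatglm_cpp

pipeline = chatglm_cpp.Pipeline("./chatglm3-ggml-q8_0.bin")

# User messages and model replies alternate in the history, oldest first.
history = ["你好"]
reply = pipeline.chat(history)

# Append the model's reply and the next user question, then ask again.
history += [reply, "你能做什么?"]
print(pipeline.chat(history))
```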