GLM-4-9b-chat Quantized with AutoFP8

使用 m-a-p/COIG-CQIA 的 COIG_pc 集作为校准量化的 glm-4-9b-chat 模型。

主要为中文通常语言逻辑任务，为 vLLM 准备。

评估

项目	THUDM/glm-4-9b-chat	此项目	Recovery
ceval-valid	71.84	70.36	97.94%
cmmlu	72.23	70.42	97.49%
agieval_logiqa_zh (5 shots)	44.24	39.32	88.88%
平均	62.77	60.03	95.63%

Safetensors

Model size

9.4B params

Tensor type

BF16

F8_E4M3

Inference Examples

Inference API (serverless) does not yet support model repos that contain custom code.