About Quantization
We use the ModelScope swift repository to perform AWQ 4-bit quantization. The quantization documentation can be found here. The quantization command is as follows:
```shell
# Experimental Environment: A100
swift export \
    --quant_bits 4 \
    --model_type yi-1_5-34b-chat \
    --quant_method awq \
    --quant_n_samples 32 \
    --dataset alpaca-zh alpaca-en sharegpt-gpt4-mini \
    --quant_seqlen 4096
```
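For intuition, AWQ stores weights as 4-bit integers with a shared scale and zero-point per small group of weights. The sketch below shows plain group-wise 4-bit quantize/dequantize only; it is an illustration, not AWQ's activation-aware scale search and not swift's implementation, and all names in it are hypothetical.

```python
# Illustration only: group-wise asymmetric 4-bit quantization, the storage
# scheme AWQ-style methods build on. Not swift's or AWQ's actual code.
import numpy as np

def quantize_4bit(w, group_size=128):
    # Each group of `group_size` weights shares one scale and zero-point.
    w = w.reshape(-1, group_size)
    wmin = w.min(axis=1, keepdims=True)
    wmax = w.max(axis=1, keepdims=True)
    scale = (wmax - wmin) / 15.0          # 4 bits -> 16 levels (0..15)
    zero = np.round(-wmin / scale)        # zero-point in quantized units
    q = np.clip(np.round(w / scale) + zero, 0, 15)
    return q.astype(np.uint8), scale, zero

def dequantize_4bit(q, scale, zero):
    # Reconstruct float weights; error is bounded by roughly one scale step.
    return (q.astype(np.float32) - zero) * scale

np.random.seed(0)
w = np.random.randn(1024).astype(np.float32)
q, s, z = quantize_4bit(w)
w_hat = dequantize_4bit(q, s, z).reshape(w.shape)
print(np.abs(w - w_hat).max())  # small reconstruction error, on the order of the group scale
```

AWQ itself additionally rescales salient channels based on activation statistics before quantizing, which is why the `swift export` command above takes calibration datasets (`--dataset`) and a sample count (`--quant_n_samples`).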
Inference:
```shell
CUDA_VISIBLE_DEVICES=0 swift infer --model_type yi-1_5-34b-chat-awq-int4
```
SFT:
```shell
CUDA_VISIBLE_DEVICES=0 swift sft --model_type yi-1_5-34b-chat-awq-int4 --dataset leetcode-python-en
```