Qwen1.5-0.5B-Chat-GGUF
Original Model
Run with LlamaEdge
LlamaEdge version: v0.2.15 and above
Prompt template
Prompt type:
chatml
Prompt string
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
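As an illustration, the template above can be filled in by hand with a small shell function. This is a minimal sketch, not part of LlamaEdge: the name `chatml_prompt` is hypothetical, and it assumes the ChatML convention of placing each role tag on its own line. When you pass `-p chatml`, LlamaEdge applies the template for you, so this is only useful for inspecting what the model receives.

```shell
# Hypothetical helper (not a LlamaEdge command): fill the ChatML template
# with one system message ($1) and one user message ($2).
chatml_prompt() {
  local system_message="$1" prompt="$2"
  # The prompt ends with the assistant tag so the model continues from there.
  printf '<|im_start|>system\n%s<|im_end|>\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n' \
    "$system_message" "$prompt"
}

# Usage:
# chatml_prompt "You are a helpful assistant." "What is WasmEdge?"
```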
Context size:
32000
Run as LlamaEdge service
wasmedge --dir .:. --nn-preload default:GGML:AUTO:Qwen1.5-0.5B-Chat-Q5_K_M.gguf llama-api-server.wasm -p chatml
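Once the service is running, it can be queried over HTTP. The sketch below assumes the server listens on `localhost:8080` and exposes an OpenAI-style `/v1/chat/completions` route; neither detail is stated on this page, so adjust `API_URL` if your setup differs. The function names `chat_body` and `ask` are hypothetical.

```shell
# Assumption: default address and an OpenAI-style chat route.
API_URL="${API_URL:-http://localhost:8080/v1/chat/completions}"

# Build a minimal JSON body from a system message ($1) and a user message ($2).
# The arguments are inserted verbatim with no JSON escaping, so they must not
# contain double quotes or backslashes.
chat_body() {
  printf '{"messages":[{"role":"system","content":"%s"},{"role":"user","content":"%s"}]}' "$1" "$2"
}

# Send the request with curl; nothing is sent until this function is called.
ask() {
  curl -s -X POST "$API_URL" \
    -H 'Content-Type: application/json' \
    -d "$(chat_body "$1" "$2")"
}

# Usage (with the server already running):
# ask "You are a helpful assistant." "What is WasmEdge?"
```

For messages that may contain quotes, building the body with a JSON-aware tool is safer than `printf`.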
Run as LlamaEdge command app
wasmedge --dir .:. --nn-preload default:GGML:AUTO:Qwen1.5-0.5B-Chat-Q5_K_M.gguf llama-chat.wasm -p chatml
Quantized GGUF Models
Name | Quant method | Bits | Size | Use case |
---|---|---|---|---|
Qwen1.5-0.5B-Chat-Q2_K.gguf | Q2_K | 2 | 298 MB | smallest, significant quality loss - not recommended for most purposes |
Qwen1.5-0.5B-Chat-Q3_K_L.gguf | Q3_K_L | 3 | 364 MB | small, substantial quality loss |
Qwen1.5-0.5B-Chat-Q3_K_M.gguf | Q3_K_M | 3 | 350 MB | very small, high quality loss |
Qwen1.5-0.5B-Chat-Q3_K_S.gguf | Q3_K_S | 3 | 333 MB | very small, high quality loss |
Qwen1.5-0.5B-Chat-Q4_0.gguf | Q4_0 | 4 | 395 MB | legacy; small, very high quality loss - prefer using Q3_K_M |
Qwen1.5-0.5B-Chat-Q4_K_M.gguf | Q4_K_M | 4 | 407 MB | medium, balanced quality - recommended |
Qwen1.5-0.5B-Chat-Q4_K_S.gguf | Q4_K_S | 4 | 397 MB | small, greater quality loss |
Qwen1.5-0.5B-Chat-Q5_0.gguf | Q5_0 | 5 | 453 MB | legacy; medium, balanced quality - prefer using Q4_K_M |
Qwen1.5-0.5B-Chat-Q5_K_M.gguf | Q5_K_M | 5 | 459 MB | large, very low quality loss - recommended |
Qwen1.5-0.5B-Chat-Q5_K_S.gguf | Q5_K_S | 5 | 453 MB | large, low quality loss - recommended |
Qwen1.5-0.5B-Chat-Q6_K.gguf | Q6_K | 6 | 515 MB | very large, extremely low quality loss |
Qwen1.5-0.5B-Chat-Q8_0.gguf | Q8_0 | 8 | 665 MB | very large, extremely low quality loss - not recommended |
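To fetch one of the files listed above, something like the following may work. It is a sketch under two assumptions: the repository id is `second-state/Qwen1.5-0.5B-Chat-GGUF`, and the files are served under Hugging Face's usual `/resolve/main/<file>` download layout. The function names `model_url` and `fetch_model` are hypothetical.

```shell
# Assumptions: repository id and the /resolve/main/<file> URL layout.
REPO="second-state/Qwen1.5-0.5B-Chat-GGUF"

# Print the download URL for a quant method from the table, e.g. Q5_K_M.
model_url() {
  printf 'https://huggingface.co/%s/resolve/main/Qwen1.5-0.5B-Chat-%s.gguf\n' "$REPO" "$1"
}

# Download the file into the current directory; only runs when called.
fetch_model() {
  curl -L -O "$(model_url "$1")"
}

# Usage:
# fetch_model Q5_K_M
```

The downloaded file name matches the one passed to `--nn-preload` in the commands above, so the model can be run from the same directory.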
Model tree for second-state/Qwen1.5-0.5B-Chat-GGUF
Base model
Qwen/Qwen1.5-0.5B-Chat