
GGUF importance matrix (imatrix) quants for https://huggingface.co/abacusai/Smaug-72B-v0.1
The importance matrix was trained for 100K tokens (200 batches of 512 tokens) using wiki.train.raw.
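For reference, below is a minimal sketch of how imatrix quants like these are typically produced with llama.cpp's `imatrix` and `quantize` tools, driven from Python. The binary paths, GGUF filenames, and chunk count are assumptions for illustration, not the exact commands used for this repo.

```python
# Sketch of the llama.cpp imatrix quantization pipeline (filenames are placeholders).
import subprocess

# 1) Collect importance statistics over a calibration text file
#    (this card used wiki.train.raw / a general-purpose calibration set).
subprocess.run([
    "./imatrix",
    "-m", "smaug-72b-v0.1-f16.gguf",   # hypothetical f16 GGUF conversion of the model
    "-f", "wiki.train.raw",            # calibration text
    "-o", "imatrix.dat",               # output importance matrix
    "--chunks", "200",                 # 200 batches of 512 tokens ~= 100K tokens
], check=True)

# 2) Quantize using the importance matrix.
subprocess.run([
    "./quantize",
    "--imatrix", "imatrix.dat",
    "smaug-72b-v0.1-f16.gguf",
    "smaug-72b-v0.1-iq2_s.gguf",       # hypothetical output filename
    "IQ2_S",
], check=True)
```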

Update 2024-03-14:

  • New quant IQ1_S, generated with the latest commit (4755afd1).

Update 2024-03-02:

  • New quants IQ2_S/IQ2_M, which require commit a33e6a0d or later.
  • The importance matrix was trained for ~50K tokens (105 batches of 512 tokens) using a general purpose imatrix calibration dataset.
  • This is a different calibration dataset from the one used for the previous quants I posted, so the quality of the two can be compared.

These quants use the Llama-2 conversation template, with the system prompt set to the Qwen system prompt.

| Layers | Context | Template |
| --- | --- | --- |
| 80 | 32768 | `[INST] <<SYS>>`<br>`{instructions}`<br>`<</SYS>>`<br><br>`{prompt} [/INST]`<br>`{response}` |
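
Below is a minimal usage sketch with llama-cpp-python that applies the template above. The GGUF filename is a placeholder for whichever quant you download, and the system prompt shown is an assumption standing in for the Qwen system prompt.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="smaug-72b-v0.1-iq2_s.gguf",  # placeholder: path to the downloaded quant
    n_ctx=32768,        # context length from the table above
    n_gpu_layers=-1,    # offload all 80 layers to GPU if memory allows
)

# Llama-2 style template with the system prompt, as described above.
system = "You are a helpful assistant."  # assumed stand-in for the Qwen system prompt
prompt = "Explain what an importance matrix (imatrix) is in one paragraph."
text = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{prompt} [/INST]\n"

out = llm(text, max_tokens=256, stop=["</s>", "[INST]"])
print(out["choices"][0]["text"])
```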