With fixes applied for:

  • Llama 3.1 RoPE scaling factors (#8676)
  • llama-bpe used as the tokenizer; see "Proper Llama 3.1 Support in llama.cpp" (#8650)
  • <|python_tag|> works for tool calls.
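
To make the first fix concrete, here is a minimal sketch of how Llama 3.1 style RoPE scaling factors can be computed per frequency. It is an illustration, not the code from #8676. The function name `llama31_rope_factors` is hypothetical, and the default values (RoPE base 500000, scaling factor 8, low/high frequency factors 1 and 4, original context 8192, head dimension 128) are assumptions taken from the published Llama 3.1 8B configuration, not from this card.

```python
import math


def llama31_rope_factors(head_dim=128, rope_theta=500000.0, factor=8.0,
                         low_freq_factor=1.0, high_freq_factor=4.0,
                         original_context=8192):
    """Return one scaling factor per RoPE frequency (head_dim // 2 values).

    All defaults are assumed values for Llama 3.1 8B; adjust them if your
    model's configuration differs.
    """
    low_freq_wavelen = original_context / low_freq_factor
    high_freq_wavelen = original_context / high_freq_factor

    factors = []
    for i in range(0, head_dim, 2):
        freq = 1.0 / (rope_theta ** (i / head_dim))
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # Short wavelengths (high frequencies) are left unscaled.
            factors.append(1.0)
        elif wavelen > low_freq_wavelen:
            # Long wavelengths (low frequencies) get the full scaling factor.
            factors.append(factor)
        else:
            # In between, interpolate smoothly between the two regimes.
            smooth = (original_context / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor
            )
            factors.append(1.0 / ((1.0 - smooth) / factor + smooth))
    return factors


if __name__ == "__main__":
    f = llama31_rope_factors()
    print(len(f), f[0], f[-1])
```

A file converted without these factors treats every frequency as unscaled, which is one plausible reason older quantizations degrade at long context.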

The following files are fixed; the others are being replaced.

  • Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf
  • Meta-Llama-3.1-8B-Instruct.IQ4_XS.gguf
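
After downloading one of these files, a quick sanity check is to read the GGUF header. The sketch below is a minimal example under stated assumptions: the helper name `read_gguf_header` is hypothetical, and it assumes a little-endian GGUF version 2 or later file, where the 4-byte magic `GGUF` is followed by a 32-bit version and two 64-bit counts. It does not verify that the fixes listed above are present.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(path):
    """Read the fixed-size GGUF header and return its fields as a dict.

    Assumes little-endian GGUF v2+ (64-bit tensor and metadata counts).
    """
    with open(path, "rb") as fh:
        head = fh.read(24)
    if len(head) < 24 or head[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # "<IQQ": uint32 version, uint64 tensor count, uint64 metadata key/value count
    version, n_tensors, n_kv = struct.unpack("<IQQ", head[4:24])
    return {"version": version, "tensor_count": n_tensors, "metadata_kv_count": n_kv}


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(read_gguf_header(sys.argv[1]))
```

For example, `python check_gguf.py Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf` (the script name is arbitrary) prints the version and counts, or raises an error if the download is truncated or is not a GGUF file.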

References

  • https://github.com/ggerganov/llama.cpp/issues/8650
  • https://github.com/ggerganov/llama.cpp/pull/8676

Format: GGUF
Model size: 8.03B params
Architecture: llama

Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 8-bit
