CybersurferNyandroidLexicat-8x7B-iMat-GGUF

CybersurferNyandroidLexicat quantized from fp16 with love.

Uses the same imat calculation method as the later batch of maid-yuzu-v8-alter-iMat-GGUF.

Legacy quants (e.g. Q5_K_M, Q6_K) in this repo have all been enhanced with importance matrix calculation. These quants show lower KL divergence from the fp16 model than their static counterparts.
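
To make the KL-divergence comparison concrete, here is a minimal sketch of how the metric can be computed for one token position. The logits below are made-up illustrative values, not measurements from this repo, and the helper names are hypothetical; this is not the llama.cpp implementation.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(P || Q) in nats, where P is the reference distribution."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# ASSUMPTION: illustrative logits for a single token position.
reference_logits = [2.0, 1.0, 0.1]       # stands in for the fp16 model
static_quant_logits = [1.5, 1.3, 0.4]    # stands in for a static quant
imat_quant_logits = [1.9, 1.05, 0.15]    # stands in for an imatrix quant

p_ref = softmax(reference_logits)
kl_static = kl_divergence(p_ref, softmax(static_quant_logits))
kl_imat = kl_divergence(p_ref, softmax(imat_quant_logits))

print(f"static: {kl_static:.5f}  imat: {kl_imat:.5f}")
```

A lower value means the quantized model's token distribution stays closer to the reference; in practice the metric is averaged over many token positions.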

All files are included here for your convenience. There is no need to clone the entire repo; just pick the quant that's right for you.
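
As a rough sketch of fetching one quant instead of cloning, the snippet below builds a direct download URL using the Hugging Face `resolve` URL pattern. The filename is hypothetical (check the repo's file list for the real names), and `single_file_url` is an illustrative helper, not part of any library.

```python
from urllib.parse import quote

REPO_ID = "InferenceIllusionist/CybersurferNyandroidLexicat-8x7B-iMat-GGUF"

def single_file_url(repo_id, filename, revision="main"):
    """Build a direct download URL for one file in a Hugging Face repo."""
    return (
        f"https://huggingface.co/{repo_id}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )

# ASSUMPTION: hypothetical filename, used only for illustration.
example_url = single_file_url(REPO_ID, "example-IQ3_S.gguf")
print(example_url)
```

The resulting URL can be passed to any downloader, so only the chosen quant is transferred.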

For more information on the latest iMatrix quants, see this PR: https://github.com/ggerganov/llama.cpp/pull/5747

Tip: The letters at the end of the quant name indicate its size. Larger quants offer better quality; smaller quants are faster.

  • IQ3_XS - XS (Extra Small)
  • IQ3_S - S (Small)
  • IQ3_M - M (Medium)
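
The naming tip above can be sketched as a small lookup. The mapping covers only the three suffixes listed here, and `size_label` is a hypothetical helper for illustration.

```python
# Suffix-to-size mapping taken from the list above.
SIZE_LABELS = {"XS": "Extra Small", "S": "Small", "M": "Medium"}

def size_label(quant_name):
    """Return the size label encoded in a quant name's final suffix."""
    suffix = quant_name.rsplit("_", 1)[-1].upper()
    return SIZE_LABELS.get(suffix, "Unknown")

labels = {name: size_label(name) for name in ("IQ3_XS", "IQ3_S", "IQ3_M")}
print(labels)
```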

Format: GGUF
Model size: 46.7B params
Architecture: llama
Quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit
