
My upload speeds have been poor and unstable lately.
Realistically I'd need to move to get a better provider.
If you want and you are able to, you can support various endeavors here (Ko-fi).
I apologize for disrupting your experience.

#llama-3 #experimental #work-in-progress

GGUF-IQ-Imatrix quants for @jeiku's ResplendentAI/SOVL_Llama3_8B.
Give them some love!

Updated! These quants have been redone with the fixes from llama.cpp/pull/6920 in mind.
Use KoboldCpp version 1.64 or higher.

Well...!
It turns out it was not just a hallucination: this model is actually pretty cool, so give it a chance!
For 8GB VRAM GPUs, I recommend the Q4_K_M-imat quant for context sizes up to 12288.
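
As a rough sanity check on that recommendation, here is a minimal sketch of the memory arithmetic. It assumes the standard Llama 3 8B shape (32 layers, 8 KV heads, head dimension 128) and an fp16 KV cache. The 4.6 GiB file size used for the Q4_K_M-imat quant is an assumed ballpark, not a measured value, and the function names are hypothetical. Compute buffers and other backend overhead are not counted, so treat the result as a lower bound.

```python
GIB = 1024 ** 3

def kv_cache_bytes(n_ctx, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Bytes needed to store keys and values for n_ctx tokens across all layers."""
    # The leading factor of 2 covers one K tensor and one V tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_ctx

def estimate_vram_gib(model_file_gib, n_ctx):
    """Model weights plus KV cache, in GiB (excludes compute buffers and overhead)."""
    return model_file_gib + kv_cache_bytes(n_ctx) / GIB

if __name__ == "__main__":
    n_ctx = 12288
    assumed_q4_k_m_gib = 4.6  # assumption: approximate size of the Q4_K_M-imat file
    print(f"KV cache: {kv_cache_bytes(n_ctx) / GIB:.2f} GiB")                    # 1.50 GiB
    print(f"Weights + KV cache: {estimate_vram_gib(assumed_q4_k_m_gib, n_ctx):.2f} GiB")  # 6.10 GiB
```

Under these assumptions the total stays below 8 GiB with some headroom, which is consistent with the recommendation above. Check the actual file size of the quant you download and leave room for whatever else is using your GPU.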

Use the provided presets.
Compatible SillyTavern presets here (simple) or here (Virt's roleplay). Use the latest version of KoboldCpp.


Format: GGUF
Model size: 8.03B params
Architecture: llama
Available quants: 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit

