
Just for fun, I tried to create an imatrix model for Kaiju-11B (https://huggingface.co/Himitsui/Kaiju-11B).

I thought it wouldn't work, since I only have a laptop with an NVIDIA RTX 3060 with 6 GB of VRAM, but surprisingly I was able to create a couple of models thanks to a single script.

Here it is: https://huggingface.co/FantasiaFoundry/GGUF-Quantization-Script

According to the script's recommendations, my laptop was not suitable. I don't know exactly how it all works; perhaps those recommendations are only about making quantization run quickly. Generating the imatrix took me about an hour and a half, but the quantization itself was fast. If anyone is interested, download the models and see how they work. This was done purely for fun, no more, no less.
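For context, the script essentially automates the standard llama.cpp imatrix workflow. A minimal sketch of the equivalent manual commands is below; the file names, calibration text, and `-ngl` value are assumptions for illustration, not the script's actual defaults:

```shell
# Convert the HF model to a full-precision GGUF first (paths are assumptions).
python convert_hf_to_gguf.py ./Kaiju-11B --outfile kaiju-11b-f16.gguf

# Build the importance matrix from a calibration text file.
# -ngl offloads a few layers to the GPU; with only 6 GB of VRAM,
# most of the work still runs on the CPU, which is why this step is slow.
./llama-imatrix -m kaiju-11b-f16.gguf -f calibration.txt -o imatrix.dat -ngl 8

# Quantize using the imatrix; Q4_K_M is one of the 4-bit variants.
./llama-quantize --imatrix imatrix.dat kaiju-11b-f16.gguf kaiju-11b-Q4_K_M.gguf Q4_K_M
```

The slow step is the imatrix generation (it runs the model over the calibration text to measure activation importance); the final quantization pass is comparatively quick, which matches what I saw.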

GGUF: 10.7B params, llama architecture, 4-bit and 5-bit quants available.
