Quantized GGUF / EXL2 please?
#1 opened by siddhesh22
Is it possible to quantize this? Would appreciate it, I only have 12GB of VRAM.
This model was trained and validated under 4-bit quantization with bitsandbytes, using double quantization and the NF4 format.
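For anyone who wants to reproduce those settings, here is a minimal sketch of loading the model in 4-bit with bitsandbytes via transformers. The model ID is a placeholder and the compute dtype is my assumption, not something the author confirmed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 with double quantization, matching the settings described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NF4 format
    bnb_4bit_use_double_quant=True,      # double quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption; not stated by the author
)

model_id = "your-org/this-model"  # placeholder: substitute the actual repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```

Note that this only loads the language model; as explained below, the pipeline also needs four extra vision models, so 12GB is likely still too little.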
I haven't quantized this model with GGUF or EXL2 myself, only with bitsandbytes. In my experience, 12GB of VRAM is too little even at 4-bit, because four additional computer vision models must be loaded alongside the language model.
Therefore, I think it may be impossible to run with only 12GB of VRAM.