Add F16 and BF16 quantization

#129
opened by andito
No description provided.
ggml.ai org

The problem with adding BF16 is that currently we use convert_hf_to_gguf.py to convert the HF model into F16, then use llama-quantize to quantize it.

So the conversion will be safetensors --> F16 --> BF16, which adds no benefit to the output model.
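
For reference, a minimal sketch of that current two-step flow (paths, file names, and the example quant type are placeholders, and the script is assumed to be called via subprocess):

```python
import subprocess

# Step 1: convert the HF safetensors checkpoint to an F16 GGUF.
subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py",
        "path/to/hf-model",
        "--outtype", "f16",
        "--outfile", "model-f16.gguf",
    ],
    check=True,
)

# Step 2: quantize the F16 GGUF with llama-quantize (Q4_K_M as an example type).
subprocess.run(
    ["llama-quantize", "model-f16.gguf", "model-q4_k_m.gguf", "Q4_K_M"],
    check=True,
)
```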

What we should do here is also modify the code that runs convert_hf_to_gguf.py, so that it outputs a BF16 GGUF file directly. A sketch of that is shown below.
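
A hedged sketch of what that change could look like, assuming the space invokes the script via subprocess and that convert_hf_to_gguf.py accepts `--outtype bf16` (paths and file names below are placeholders):

```python
import subprocess

def convert_to_bf16(model_dir: str, out_file: str) -> None:
    # Ask convert_hf_to_gguf.py to write BF16 tensors directly,
    # instead of converting to F16 first and re-quantizing.
    subprocess.run(
        [
            "python", "llama.cpp/convert_hf_to_gguf.py",
            model_dir,
            "--outtype", "bf16",
            "--outfile", out_file,
        ],
        check=True,
    )

# Example usage with placeholder paths:
convert_to_bf16("path/to/hf-model", "model-bf16.gguf")
```

The same `--outtype` switch could cover the F16 case, so F16 and BF16 would both skip the llama-quantize step entirely.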

