Must be a quant

#2
by vanakema - opened
MLX Community org

This model is the same size as the 4-bit quantization. Are you sure this is not a 4-bit quantization that's just being "expanded" to bf16 at runtime? The pretrained model mlx-community/Meta-Llama-3.1-70B-bf16 is several times larger than this instruct model.
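
For a back-of-the-envelope check (my own arithmetic, not from the model card, assuming MLX's default 4-bit settings of group size 64 with an fp16 scale and bias per group):

```python
# Rough size estimate for Llama 3.1 70B in bf16 vs. 4-bit.
params = 70.6e9  # approximate parameter count

# bf16 stores 2 bytes per weight.
bf16_gb = params * 2 / 1e9

# 4-bit with group size 64 stores 4 bits per weight plus an fp16 scale
# and bias per 64-weight group: 2 * 16 / 64 = 0.5 extra bits per weight.
q4_gb = params * (4 + 0.5) / 8 / 1e9

print(f"bf16 ~ {bf16_gb:.0f} GB, 4-bit ~ {q4_gb:.0f} GB")  # ~141 GB vs ~40 GB
```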

MLX Community org
edited Sep 21

Config file says this:

"quantization": {
"group_size": 64,
"bits": 4
},..

It appears to be a 4-bit quantized model.
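
For anyone who wants to check this without downloading the weights, here's a minimal sketch using the huggingface_hub package (the repo id is my guess at this model's id from the discussion; substitute whatever is on the model page):

```python
import json

from huggingface_hub import hf_hub_download

# Hypothetical repo id inferred from the thread; replace with the actual one.
REPO_ID = "mlx-community/Meta-Llama-3.1-70B-Instruct-bf16"

# Fetch only config.json (a few KB), not the weights.
config_path = hf_hub_download(repo_id=REPO_ID, filename="config.json")
with open(config_path) as f:
    config = json.load(f)

# A genuine bf16 export has no "quantization" block at all.
print(config.get("quantization", "no quantization block -> unquantized"))
```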

MLX Community org

If I wanted to upload a proper bf16 MLX version to replace this one, how would I go about that? I know I could make my own model repo, but how do I get permission to update this incorrect one?

MLX Community org

Not sure, but you can probably join the community (https://huggingface.co/mlx-community) and upload the correct bf16 model (+ New) with a slightly different name indicating that it is the correct one.
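
For reference, a minimal sketch of producing and uploading an unquantized conversion with mlx-lm's Python API (the target repo name is illustrative, uploading requires a `huggingface-cli login` with write access to mlx-community, and the bf16 weights are on the order of 140 GB):

```python
from mlx_lm import convert

# Source: Meta's official repo, which ships the weights in bf16
# (gated - requires accepting the license on the Hub first).
HF_REPO = "meta-llama/Meta-Llama-3.1-70B-Instruct"

# Illustrative target name; pick one that makes the precision unambiguous.
UPLOAD_REPO = "mlx-community/Meta-Llama-3.1-70B-Instruct-bf16-correct"

# quantize=False keeps the source dtype instead of writing 4-bit weights,
# so no "quantization" block ends up in config.json.
convert(HF_REPO, quantize=False, upload_repo=UPLOAD_REPO)
```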

MLX Community org

Fixed ✅

prince-canuma changed discussion status to closed
