Must be a quant
This model is the same size as the 4-bit quantization. Are you sure this isn't a 4-bit quantization that's just being "expanded" to bf16 at runtime? The pretrained model mlx-community/Meta-Llama-3.1-70B-bf16 is several times larger than this instruct model.
The config file says this:

```json
"quantization": {
    "group_size": 64,
    "bits": 4
}
```

So it appears to be a 4-bit quantized model.
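For anyone who wants to verify this without pulling the full weights, here is a minimal sketch that fetches only config.json from the Hub and looks for a "quantization" key (the repo id below is an assumption, substitute the repo this discussion is attached to):

```python
# Minimal sketch: inspect config.json without downloading the weights.
import json
from huggingface_hub import hf_hub_download

config_path = hf_hub_download(
    repo_id="mlx-community/Meta-Llama-3.1-70B-Instruct-bf16",  # assumed repo id
    filename="config.json",
)
with open(config_path) as f:
    config = json.load(f)

# A genuine bf16 export should have no "quantization" entry at all.
print(config.get("quantization"))
```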
If I wanted to upload a proper bf16 MLX conversion to replace this one, how would I go about that? I know I could make my own model repo, but how would I get permission to update this incorrect one?
Not sure, but you can probably join the community (https://huggingface.co/mlx-community) and upload the correct bf16 model (via + New) with a slightly different name indicating that it is the correct one.
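If it helps, here is a rough sketch of producing and uploading a true bf16 conversion with mlx-lm's Python API. Argument names can differ between mlx-lm versions, and both the source and target repo ids below are assumptions, so adjust accordingly:

```python
# Rough sketch, assuming a recent mlx-lm (pip install mlx-lm); check
# `python -m mlx_lm.convert --help` if the arguments have changed.
from mlx_lm import convert

convert(
    hf_path="meta-llama/Meta-Llama-3.1-70B-Instruct",  # assumed source repo with the original weights
    mlx_path="Meta-Llama-3.1-70B-Instruct-bf16",       # local output directory
    dtype="bfloat16",                                   # keep bf16 weights, do NOT pass quantize=True
    upload_repo="mlx-community/Meta-Llama-3.1-70B-Instruct-bf16-fixed",  # hypothetical new repo name
)
```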
Fixed ✅