Safetensors model file name
Many thanks for the merge.
Working on quanting to GGUF; it seems that for single-file models only "model.safetensors" is currently supported: https://github.com/ggerganov/llama.cpp/blob/fbbc42827b2949b95bcde23ce47bb47d006c895d/convert-hf-to-gguf.py#L180
Commenting out lines 180-181 works for now; the initial output looks good.
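A less invasive workaround than editing the script is to give the converter the file name it expects, something like this (the model directory path is a placeholder):

```python
# Hedged alternative to commenting out the check in convert-hf-to-gguf.py:
# symlink (or rename) the single-shard weight file to the name the script
# looks for. The model directory path is a placeholder.
from pathlib import Path

model_dir = Path("path/to/model")
src = next(model_dir.glob("*.safetensors"))  # the single weight file
dst = model_dir / "model.safetensors"
if not dst.exists():
    dst.symlink_to(src.name)  # or: src.rename(dst)
```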
🙏thanks
ok I will rename it
hey, can you quantize this model? https://huggingface.co/NousResearch/Obsidian-3B-V0.5 I haven't seen any quants of it yet, and the projector too if possible.
I have used https://huggingface.co/nisten/obsidian-3b-multimodal-q6-gguf before.
Ran into issues the last time I tried quanting it myself; will try with the latest llama.cpp commit tomorrow.
Quants are up here: https://huggingface.co/afrideva/Echo-3B-GGUF
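For anyone following along, the convert-then-quantize flow is roughly the sketch below; paths, file names, and the Q4_K_M quant type are placeholder assumptions:

```python
# Rough sketch of the convert -> quantize flow with llama.cpp; paths,
# file names, and the quant type are placeholder assumptions.
import subprocess

model_dir = "path/to/model"   # local HF checkout (placeholder)
f16_gguf = "model.f16.gguf"
q_gguf = "model.Q4_K_M.gguf"

# 1. Convert the HF checkpoint to an f16 GGUF.
subprocess.run(
    ["python", "convert-hf-to-gguf.py", model_dir,
     "--outtype", "f16", "--outfile", f16_gguf],
    check=True,
)

# 2. Quantize with the `quantize` binary that ships with llama.cpp.
subprocess.run(["./quantize", f16_gguf, q_gguf, "Q4_K_M"], check=True)
```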
Thank you! I'm trying to make a 5B StableLM model here: https://huggingface.co/Aryanne/testing-only. I quantized it locally, and at inference I got an error about the graph. I don't know if it's a problem with the .json files, my quantization, or the model itself. Can you take a look?
This is the error:
```
GGML_ASSERT: ggml.c:15158: cgraph->n_nodes < cgraph->size
Aborted
```
maybe 58 layers was too many 😅
Apparently so; just tried it with the latest llama.cpp and got the same error... Your Zephyr-3.43B looks good though, quanting it now.
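Rough intuition for why layer count matters here: each transformer layer adds a fixed batch of ops to the compute graph, and the graph has a fixed node capacity (`cgraph->size`), so a deep enough model overflows it and the assert fires. A back-of-envelope sketch, where every constant is a made-up placeholder rather than a value taken from llama.cpp:

```python
# Back-of-envelope only: why a very deep model can trip
# `cgraph->n_nodes < cgraph->size`. Every constant below is a made-up
# placeholder, not a value read from llama.cpp.
GRAPH_CAPACITY = 4096   # hypothetical fixed graph size (cgraph->size)
NODES_PER_LAYER = 80    # hypothetical graph ops added per layer
BASE_NODES = 100        # hypothetical embedding/output-head ops

for n_layers in (32, 48, 58):
    n_nodes = BASE_NODES + n_layers * NODES_PER_LAYER
    status = "ok" if n_nodes < GRAPH_CAPACITY else "assert would fire"
    print(f"{n_layers} layers -> {n_nodes} nodes: {status}")
```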