llama.cpp hf to gguf not working
INFO:hf-to-gguf:Loading model: 046a9891f7a2b94706a0ba5c1b93c6c835000f15
ERROR:hf-to-gguf:Model LlavaForConditionalGeneration is not supported.
Is there any way I can convert this to GGUF? I wanted to create an API locally.
Transformers doesn't even support Pixtral yet, so it stands to reason that llama.cpp doesn't either.
If you need to run locally right now, you can use MistralAI's official repository with vLLM.
I imagine the reason you wanted to convert to GGUF in the first place was to quantize it; vLLM can quantize automatically at load time.
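For reference, serving Pixtral with vLLM might look something like the sketch below. The model ID and flags are assumptions based on MistralAI's published vLLM instructions, so check their repository for the exact invocation; `--quantization fp8` is one example of a load-time quantization option.

```shell
# Hedged sketch: start an OpenAI-compatible server with Pixtral via vLLM.
# --tokenizer-mode mistral : Pixtral ships a Mistral-format tokenizer.
# --quantization fp8       : quantize weights at load time instead of
#                            pre-converting to a quantized GGUF.
vllm serve mistralai/Pixtral-12B-2409 --tokenizer-mode mistral --quantization fp8

# The server exposes an OpenAI-compatible API on port 8000 by default:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mistralai/Pixtral-12B-2409",
       "messages": [{"role": "user", "content": "Hello"}]}'
```

That gives you a local HTTP API without any GGUF conversion step.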
Transformers does support Pixtral; not sure I understand your comment. https://github.com/huggingface/transformers/pull/33449 🤔
In the dev version sure, but not the release version that llama.cpp points to. I could've been more specific about that.
To be more precise:
"llama.cpp doesn't support Pixtral in any meaningful capacity yet, and the version of Transformers that llama.cpp's conversion script depends on is too early. Once Transformers v45 is fully released, we can then expect llama.cpp to implement support for Pixtral, but likely not sooner."