llama.cpp hf to gguf not working

#2
by RameshRajamani - opened

INFO:hf-to-gguf:Loading model: 046a9891f7a2b94706a0ba5c1b93c6c835000f15

[Attached screenshot: issue.jpg]
ERROR:hf-to-gguf:Model LlavaForConditionalGeneration is not supported

Is there any way that I can convert this to GGUF? I wanted to create an API locally.

RameshRajamani changed discussion title from python convert_hf_to_gguf.py models--mistral-community--pixtral-12b\snapshots\046a9891f7a2b94706a0ba5c1b93c6c835000f15 --outfile pixtral.gguf to llama.cpp hf to gguf not working

Transformers doesn't even support Pixtral yet, so it stands to reason that llama.cpp doesn't either.

If you need to run it locally right now, you can use Mistral AI's official repository with vLLM.

I imagine the reason you wanted to convert to GGUF in the first place was to quantize the model; vLLM can quantize automatically at load time.
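If it helps, here's a minimal sketch of what that could look like. The repo id, the `tokenizer_mode` setting, and the `quantization="fp8"` option are assumptions on my part; check the vLLM docs for the version you have installed.

```python
# Rough sketch: run Pixtral with vLLM and quantize at load time.
# Assumptions: repo id "mistralai/Pixtral-12B-2409", tokenizer_mode="mistral",
# and FP8 load-time quantization (requires a GPU/build that supports it).
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Pixtral-12B-2409",  # Mistral AI's official weights (assumed id)
    tokenizer_mode="mistral",            # Pixtral ships a Mistral-format tokenizer
    quantization="fp8",                  # quantize automatically at load time
    max_model_len=8192,                  # keep the KV cache small on modest GPUs
)

messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/some-image.png"}},
    ],
}]

outputs = llm.chat(messages, sampling_params=SamplingParams(max_tokens=256))
print(outputs[0].outputs[0].text)
```

And since you mentioned wanting a local API: vLLM also ships an OpenAI-compatible server (something like `vllm serve mistralai/Pixtral-12B-2409 --tokenizer-mode mistral`; exact flags may differ by version), which would cover that use case without going through GGUF at all.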

Unofficial Mistral Community org

Transformers does support Pixtral; not sure I understand your comment: https://github.com/huggingface/transformers/pull/33449 πŸ€—

In the dev version, sure, but not in the release version that llama.cpp points to. I could have been more specific about that.

To be more precise:

"llama.cpp doesn't support Pixtral in any meaningful capacity yet, and the version of Transformers that llama.cpp's conversion script depends on is too early. Once Transformers v45 is fully released, we can then expect llama.cpp to implement support for Pixtral, but likely not sooner."
