https://huggingface.co/sirmyrrh/Kyllima-34B-v1
I tried to do this myself and just couldn't figure it out.
Probably because the tokenizer seems broken (or it hits a bug in llama.cpp). I couldn't figure it out earlier today, either :)
Ugh, why. I already regenerated this model once. No matter what tokenizer source I use, the result seems to be broken. Thanks for trying! :) I'll look at it again and see if I can figure it out.
Maybe @nicoboss has an idea?
I know there used to be issues with the Yi 34B tokenizer that caused problems with GGUF conversion, but I thought they were fixed in llama.cpp a while ago. Both models in this merge were GGUFed successfully on their own, and I used one of them as the tokenizer source, so it seems like it should work. There are obviously subtleties here that I don't understand. ;P
For reference, this should be the relevant output:
INFO:hf-to-gguf:Set model tokenizer
WARNING:hf-to-gguf:ignore token 64001: id is out of range, max=63999
WARNING:hf-to-gguf:ignore token 64000: id is out of range, max=63999
WARNING:hf-to-gguf:replacing token 1: '<|startoftext|>' -> '<s>'
WARNING:hf-to-gguf:replacing token 2: '<|endoftext|>' -> '</s>'
Traceback (most recent call last):
  File "/root/cvs/llama.cpp/convert_hf_to_gguf.py", line 4359, in <module>
    main()
  File "/root/cvs/llama.cpp/convert_hf_to_gguf.py", line 4353, in main
    model_instance.write()
  File "/root/cvs/llama.cpp/convert_hf_to_gguf.py", line 426, in write
    self.prepare_metadata(vocab_only=False)
  File "/root/cvs/llama.cpp/convert_hf_to_gguf.py", line 419, in prepare_metadata
    self.set_vocab()
  File "/root/cvs/llama.cpp/convert_hf_to_gguf.py", line 1507, in set_vocab
    self._set_vocab_sentencepiece()
  File "/root/cvs/llama.cpp/convert_hf_to_gguf.py", line 730, in _set_vocab_sentencepiece
    tokens, scores, toktypes = self._create_vocab_sentencepiece()
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/cvs/llama.cpp/convert_hf_to_gguf.py", line 799, in _create_vocab_sentencepiece
    if toktypes[token_id] != SentencePieceTokenTypes.UNUSED:
       ~~~~~~~~^^^^^^^^^^
IndexError: list index out of range
job finished, status 1
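In case it helps anyone who hits the same thing: the warnings above say the converter sees a maximum token id of 63999 (so a vocab of 64000 entries), while the tokenizer files apparently still define ids 64000 and 64001. The IndexError looks consistent with one of those ids being used to index a list that only has 64000 slots, though that is a guess from the log and not a confirmed diagnosis. Here is a rough sketch of how one could check for such ids before converting. The function name and the token names are made up for illustration; in practice the mapping would presumably come from the model's added-token metadata and the size from its config.

```python
def find_out_of_range_tokens(vocab_size, added_tokens):
    """Return (token, id) pairs whose id does not fit in a vocab of vocab_size entries.

    added_tokens maps token text to token id. This helper is hypothetical and
    is not part of llama.cpp or transformers.
    """
    bad = [
        (token, token_id)
        for token, token_id in added_tokens.items()
        if not 0 <= token_id < vocab_size
    ]
    # Sort by id so the report is stable regardless of dict order.
    return sorted(bad, key=lambda pair: pair[1])


# Values mirroring the log: max=63999 implies a vocab size of 64000.
vocab_size = 64000

# Assumed example data: the log only gives the ids 64000 and 64001,
# so the names "<extra_a>" and "<extra_b>" are placeholders.
added_tokens = {
    "<|startoftext|>": 1,
    "<|endoftext|>": 2,
    "<extra_a>": 64000,
    "<extra_b>": 64001,
}

bad = find_out_of_range_tokens(vocab_size, added_tokens)
for token, token_id in bad:
    print(f"token {token!r} has id {token_id}, but max is {vocab_size - 1}")

# Indexing a per-token list with such an id fails the same way the traceback does.
toktypes = [0] * vocab_size
try:
    toktypes[bad[0][1]]
    index_error_raised = False
except IndexError:
    index_error_raised = True
```

If a check like this reports anything, the two usual ways out would be to drop the extra tokens from the tokenizer files or to make the declared vocab size (and the embedding rows) large enough to cover them. Which one is right depends on how the merge was built.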
I figured it out and got it to convert to GGUF finally. Thanks!
well, you didn't ask for it, but i'll make imatrix ones nevertheless :)
Thank you. It's much appreciated!