
smaug-bpe error

#1
by xxx31dingdong - opened

Why is the llama model using smaug-bpe instead of llama-bpe? Ooba is not updated for it yet. Is there a fix for it, or do the quants have to be re-uploaded?
Edit: I converted it to llama-bpe to make it work for now with "python gguf-new-metadata.py --pre-tokenizer llama-bpe input output".
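(For reference, a minimal sketch of how to inspect which pre-tokenizer a quant actually carries, before or after such an override. It assumes the gguf Python package from llama.cpp's gguf-py is installed, e.g. via "pip install gguf", and "model.gguf" is a placeholder file name.)

```python
# Sketch: read the tokenizer.ggml.pre metadata field from a GGUF file.
# Assumes the `gguf` package (llama.cpp's gguf-py) is installed.
from gguf import GGUFReader

reader = GGUFReader("model.gguf")  # placeholder path to your quant
field = reader.get_field("tokenizer.ggml.pre")

if field is None:
    print("no tokenizer.ggml.pre field set")
else:
    # string fields keep their bytes in parts[]; data[] points at the payload part
    value = bytes(field.parts[field.data[0]]).decode("utf-8")
    print(f"pre-tokenizer: {value}")  # e.g. 'smaug-bpe' or 'llama-bpe'
```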

It's because the model these quants were made from uses the smaug-bpe pre-tokenizer. It does not need fixing on the quant side: overriding it with the wrong pre-tokenizer will break the model, even if it then appears to run because you are using outdated software. If you think the model should not use that tokenizer config, you would need to report this to the model creator - I only provide the quants.

mradermacher changed discussion status to closed

I've checked the original model and it definitely uses the smaug and not the llama-3 tokenizer config.
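(A minimal sketch of one way to reproduce that check, assuming the huggingface_hub package is installed and "creator/original-model" is a placeholder for the actual source repo: the pre_tokenizer section of the repo's tokenizer.json is where the Smaug and Llama-3 configurations differ.)

```python
# Sketch: download the original repo's tokenizer.json and print its
# pre_tokenizer section to compare against the Llama-3 configuration.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download("creator/original-model", "tokenizer.json")  # placeholder repo id
with open(path, encoding="utf-8") as f:
    tok = json.load(f)

print(json.dumps(tok.get("pre_tokenizer"), indent=2))
```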
