Generation never stops until it hits the token limit on the 6.5 bpw quant.
Maybe it's a quantization issue?
This is an odd model that spits out an unused token instead of the stop string. I found that adding that token to my custom stop strings helped bring it in line.
Also, there are some odd config issues that need to be solved to get it running nicely in text-generation-webui with the HF loader; if that's what you're using, let me know. If you're using raw exllamav2, then it's probably that I need to include an added_tokens.json with the proper im_start and im_end token mappings. Even then it'll still output [UNUSED_TOKEN_145] as its EOS token, so I think you just need to specify that as a stop token.
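For reference, something roughly like this can generate that file (just a sketch, not the exact file I'd ship; it assumes the upstream internlm/internlm2-chat-7b tokenizer and the transformers library):

```python
import json
from transformers import AutoTokenizer

# Sketch: pull the ChatML special-token IDs straight from the upstream repo's
# tokenizer so the mapping isn't hand-copied wrong. The repo name below is an
# assumption; point it at whatever model you actually quantized.
tok = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)

added_tokens = {t: tok.convert_tokens_to_ids(t) for t in ("<|im_start|>", "<|im_end|>")}

with open("added_tokens.json", "w", encoding="utf-8") as f:
    json.dump(added_tokens, f, ensure_ascii=False, indent=2)
```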
@bartowski
Thanks for your answer, but I am using tabbyAPI. Could you tell me in detail how to solve this problem?
(Sorry, I'm not very familiar with this configuration.)
https://github.com/InternLM/InternLM/issues/642
https://github.com/InternLM/InternLM/issues/654
https://github.com/InternLM/InternLM/issues/662
These are possibly related issues raised by others on the original model's release page.
I'm sorry I don't know how to solve this myself, but you can refer to them; maybe they will help.
Ah okay, it doesn't look like there's a custom stop string defined there...
Can you try adding a file called added_tokens.json with this content:
https://github.com/oobabooga/text-generation-webui/issues/5375#issuecomment-1927859570
It might not fix it, but I'm curious if it will.
I added the file so you can download it: https://huggingface.co/bartowski/internlm2-chat-7b-llama-exl2/blob/6_5/added_tokens.json
@bartowski
Thank you very much for your help, but it still can't stop.
The recurring <|im_start|> and <|im_end|> in the dialogue disappeared; instead, [UNUSED_TOKEN_145] appeared again.
Ah goodie... so it's just worse 🥲 Damn, I wish I knew what was going on...
Can you check the token ID that it spits out where it SHOULD be stopping? And see if maybe that's not 2?
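If it helps, a quick way to check is with the original tokenizer, something like this (a sketch only; it assumes the internlm/internlm2-chat-7b tokenizer and transformers):

```python
from transformers import AutoTokenizer

# Sketch: see which IDs the suspicious strings map to, and what the tokenizer
# thinks the EOS token is, so we can compare against what the model emits.
tok = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)

for t in ("<|im_end|>", "[UNUSED_TOKEN_145]"):
    print(t, "->", tok.convert_tokens_to_ids(t))

print("eos_token:", tok.eos_token, "->", tok.eos_token_id)
```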
Oh okay @Pevernow, I managed to look into it, and it looks like with tabby you can specify a stop token:
So if it emits a specific token every time it should be stopping, you can probably set it there and it will be added as a stop condition.
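Since tabbyAPI exposes an OpenAI-compatible endpoint, the stop condition can also be passed per request, roughly like this (a sketch; the URL, port, API key, and stop strings are assumptions to adjust for your setup):

```python
import requests

# Sketch: pass the offending token's text as a stop string on an
# OpenAI-compatible completion request. Endpoint, port, and key are whatever
# your tabbyAPI instance is configured with.
resp = requests.post(
    "http://127.0.0.1:5000/v1/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "prompt": "<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n",
        "max_tokens": 256,
        "stop": ["[UNUSED_TOKEN_145]", "<|im_end|>"],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["text"])
```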
Unfortunately, I changed it and it didn't do anything; perhaps the sample config file was not loaded correctly.
But I found another approach that solved this problem perfectly: resetting eos_token_id in the model's config.json to the ID corresponding to <|im_end|>.
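For reference, a rough sketch of that edit (the model directory path is a placeholder, and it assumes the local tokenizer resolves <|im_end|>):

```python
import json
from transformers import AutoTokenizer

MODEL_DIR = "path/to/internlm2-chat-7b-llama-exl2"  # placeholder path

# Sketch: point the model's eos_token_id at the <|im_end|> ID so generation
# stops where the chat template ends a turn.
tok = AutoTokenizer.from_pretrained(MODEL_DIR, trust_remote_code=True)
im_end_id = tok.convert_tokens_to_ids("<|im_end|>")

with open(f"{MODEL_DIR}/config.json", encoding="utf-8") as f:
    cfg = json.load(f)

cfg["eos_token_id"] = im_end_id

with open(f"{MODEL_DIR}/config.json", "w", encoding="utf-8") as f:
    json.dump(cfg, f, indent=2)
```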
Thanks again for helping me. Please apply my fix to the repository so others don't get stuck on this issue again.
This is probably the most capable small Chinese model I can find at the moment. It is very helpful for my research on Chinese role-playing, so I don't want to give up on it, even though it had trouble stopping before.
Thank you. 🥰❤️