3bpw not working anymore

#1
by mpasila - opened

Earlier today I was able to run it without any problems, but after TheBloke updated the Runpod Ooba Template to use the OpenAI API, it gives me garbled nonsense when I try to generate text. I tried running it with just the ExllamaV2 loader instead of the HF loader, but that didn't help. I also tried another EXL2 model and that one worked without any problems, so it's just this model that has this weird tokenization problem.

https://github.com/turboderp/exllamav2/issues/123#issuecomment-1775406636 this might be related to my issue, the garbled text looks very similar to what I got.
Edit: I tested the other model you quantized and that one works but just not this one. So it might have that BOS token thing that's breaking it.

Hi there! I haven't tested today since I haven't been home, but it's strange that it stopped working after those updates, since they just added new samplers. Maybe a change in ooba broke it? At least the other one works for now, but I can't say with certainty what the issue is yet, I'm sorry.

Ok, so it seems the problem was that when the client added a BOS token the model would start giving garbage, but when I disabled that it started behaving normally. (I think maybe the new OpenAI API added a BOS token by default, or something like that caused the problem. I'm not really sure, since on SillyTavern I was able to just disable it, but on another client that wasn't even an option, yet it worked prior to that update. Though that client also just recently got updated and it works now.)

mpasila changed discussion status to closed

Having the same issue with textgen.

You are a helpful AI assistant.

USER: Hi
ASSISTANT:  Cord Domainumar domainsuo Cord Domainzor StringBuilder CordDomain Cordzor cordTagszor Cordzor Cord Cord Maj Cord repe Cord Cord Linearoch domain Cordzorzor Cord cord StringBuilder CordDomain DOM Domain sugar cord Cord Cordzor fixDomain

Did you try disabling the BOS token? Uncheck the box for "Add the bos_token to the beginning of prompts" in the Parameters tab.

Yes, that did the trick. Since I'm using the OpenAI extension I also had to set "add_bos_token=False" for the completion request.
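
For anyone else hitting this through the API, here is a rough sketch of what such a completion request could look like. The only thing taken from this thread is the `add_bos_token` setting. The URL, port, `max_tokens` value and the helper names `build_completion_payload` and `send_completion` are assumptions for illustration, so adjust them to your own setup.

```python
import json
import urllib.request

# Assumed local address of the OpenAI-compatible endpoint; change host/port as needed.
API_URL = "http://127.0.0.1:5000/v1/completions"


def build_completion_payload(prompt, max_tokens=200, add_bos_token=False):
    """Build the JSON body for a completion request (hypothetical helper).

    add_bos_token=False asks the backend not to prepend a BOS token,
    which is the setting reported in this thread to fix the garbled output.
    """
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "add_bos_token": add_bos_token,
    }


def send_completion(payload, url=API_URL):
    """POST the payload and return the decoded JSON response (hypothetical helper)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Build the request body only; call send_completion(payload) once a server is running.
payload = build_completion_payload(
    "You are a helpful AI assistant.\n\nUSER: Hi\nASSISTANT:"
)
print(json.dumps(payload, indent=2))
```

Building the payload is kept separate from sending it so you can print and check the body before anything goes over the network.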
