Issues in text-generation-webui
So I'm running this model in text-generation-webui, and it's partially working with llama.cpp, but the model glitches out after about 6 tokens and starts repeating the same words, and if I increase the repeat penalty it starts spewing out random words instead. Any idea how to fix this?
I'm seeing now that support for text-generation-webui is still slowly being worked on; in the meantime, what command do we use for this on Windows with koboldcpp?
The same as I show in the README, except you run koboldcpp.exe instead of python koboldcpp.py. The rest of the arguments should be the same.
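For example, if the command from the README were (arguments illustrative, not copied from the actual README):

python koboldcpp.py --stream --threads 8 mymodel.ggmlv3.q6_K.bin

then the Windows equivalent would be:

koboldcpp.exe --stream --threads 8 mymodel.ggmlv3.q6_K.bin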
OK, I was able to get it to run, but I still have the issue where the model glitches out after about 6 tokens and starts repeating the same words. Here is what I'm running on Windows:
koboldcpp.exe --stream --unbantokens --threads 8 --noblas vicuna-33b-1.3-superhot-8k.ggmlv3.q6_K.bin
I'm running on CPU exclusively because I only have enough RAM to run the model on the CPU. Is there something I'm doing wrong that's causing the glitch? What settings do you run in koboldcpp so that the model behaves normally?
You need to set --contextsize, e.g. --contextsize 4096. These SuperHOT models seem to perform very poorly at the default 2048 context, but are OK at higher context sizes.
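For example, adding it to the command you posted:

koboldcpp.exe --stream --unbantokens --threads 8 --noblas --contextsize 4096 vicuna-33b-1.3-superhot-8k.ggmlv3.q6_K.bin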
This should be resolved in future as the context-increasing algorithm improves.
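(If you're curious how the context increase works: to my understanding these SuperHOT models rely on RoPE position interpolation, i.e. compressing a longer range of token positions into the positional range the base model was trained on. Here is a minimal NumPy sketch of that idea, not koboldcpp's actual implementation:)

import numpy as np

def rope_angles(head_dim, positions, base=10000.0, scale=1.0):
    # Standard RoPE: each pair of channels rotates at its own frequency.
    # A scale < 1 squeezes positions, so e.g. 8192 tokens map into the
    # 0..2048 range the base model saw during training.
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))
    angles = np.outer(np.asarray(positions) * scale, inv_freq)
    return np.cos(angles), np.sin(angles)

# SuperHOT-style 8k extension: interpolation factor 2048 / 8192 = 0.25
cos, sin = rope_angles(head_dim=128, positions=np.arange(8192), scale=2048 / 8192)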
Honestly, I don't know if I'm using the wrong version of koboldcpp.exe, but that program only allows you to generate up to 500 tokens, even with the --contextsize 4096 flag enabled. The version I'm using on Windows is from:
https://github.com/LostRuins/koboldcpp
Hey, I finally got it working. Using the command line to change the context size in koboldcpp doesn't work for generation (maybe it helps for loading the model); you have to go into Settings, find the actual numbers for Max_tokens and amount_to_generate, and set those manually to 8k and 4k respectively. That's the only way it works: even though the sliders don't go that far, you can edit the numbers yourself and the model will work with them.
Ah yeah, I used to have a note about that in my READMEs but it got lost somewhere along the way. I'll make sure to add it back in future!