ask #12
by ReD2401 - opened
Hello TheBloke, I hope you are well.
Thank you for all your effort \o/
I'm happy to take a look. There are some complications with GPT-J models: llama.cpp can't load them, and the latest and best 4-bit quantisation code for CPU doesn't work with them. It would be possible to use them in GPT4All-Chat though, or there's a CLI version that also supports GPT-J on CPU.
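For reference, here is a minimal sketch of driving a GPT-J-style model on CPU through the gpt4all Python bindings; the model filename is a placeholder, and the exact API can vary between gpt4all versions:

```python
# Minimal sketch using the gpt4all Python bindings (pip install gpt4all).
# The model filename below is a placeholder; substitute the actual
# GPT-J GGML file you want to run. API details may vary by version.
from gpt4all import GPT4All

# Downloads the model to the local cache if it is not already present.
model = GPT4All("ggml-gpt4all-j-v1.3-groovy.bin")

# Run a single CPU-only generation.
response = model.generate("Write a haiku about quantisation.", max_tokens=64)
print(response)
```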
GPTQ 4-bit for GPU should be possible, but it's not as well supported.
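For the GPU route, a hedged sketch using the AutoGPTQ library; the repo id below is hypothetical, and the right loading options depend on how the model was actually quantised:

```python
# Sketch of loading a 4-bit GPTQ quantisation of GPT-J on GPU,
# assuming the AutoGPTQ library (pip install auto-gptq transformers).
# The repo id is hypothetical; point it at a real GPTQ upload.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

repo = "TheBloke/gptj-6B-GPTQ"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoGPTQForCausalLM.from_quantized(repo, device="cuda:0", use_safetensors=True)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```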
I will give it a go and let you know!
Thank you so much, I really appreciate it!
Why did you edit the model out of your comment? Do you not want me to look at it any more? Or did someone else do it?