It doesn't work with Exllama at the moment
#1 opened by Shouyi987
Probably because of different architecture:
RuntimeError: shape '[1, 74, 64, 128]' is invalid for input of size 75776
Output generated in 0.00 seconds (0.00 tokens/s, 0 tokens, context 75, seed 909967695)
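The numbers in that error are consistent with an architecture mismatch: the tensor holds 1024 elements per token, but the view ExLlama attempts expects 64 × 128 = 8192 per token (reading the last two dims as num-heads × head-dim is my interpretation, not confirmed in the thread). A quick sanity check:

```python
# Elements the reshape target '[1, 74, 64, 128]' requires
expected = 1 * 74 * 64 * 128      # 606,208 elements
# Elements actually present, per the error message
actual = 75776

per_token_actual = actual // 74       # 1024 elements per token
per_token_expected = 64 * 128         # 8192 elements per token

print(expected, per_token_actual, per_token_expected)
# The per-token sizes disagree by a factor of 8, so the loader's
# assumed head layout doesn't match this model's weights.
```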
I solved this error by updating to the latest version of the transformers library.
Yes, please update to the latest Transformers GitHub code to fix compatibility with AutoGPTQ and GPTQ-for-LLaMa. ExLlama won't work yet, I believe.
pip3 install git+https://github.com/huggingface/transformers
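If you want to confirm whether your installed transformers is already new enough before reinstalling, a small sketch (the helper and the version threshold are mine, not from the thread; installing from Git as above is still the sure fix):

```python
from importlib.metadata import version

def needs_upgrade(installed: str, minimum: str) -> bool:
    """Return True if a dotted version string is older than the minimum.

    Only handles plain numeric versions like '4.30.2'; dev/rc suffixes
    from a Git install would need real version parsing (e.g. packaging).
    """
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) < as_tuple(minimum)

# Hypothetical threshold -- pick whichever release actually includes the fix.
print(needs_upgrade(version("transformers").split("+")[0], "4.31.0"))
```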
I have updated the README to reflect this. I should have added it last night, but I didn't get these uploaded until 4am and I forgot.