Why no GPTQ of Yi-34B 200k context?
#5, opened by DrNicefellow
That one should be more useful
@Mlemoyne
Done by now https://huggingface.co/TheBloke/Yi-34B-200K-Llamafied-GPTQ
You just need to limit the context length. There is no need to apply RoPE scaling when running inference with exllamav2.
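For a rough idea of why capping the context matters, here is a back-of-the-envelope sketch of how the KV cache grows with context length. The default layer count, KV head count, and head dimension are what I believe Yi-34B's config uses, and the FP16 cache is also an assumption, so treat all of them as placeholders and check them against the model's config.json. `kv_cache_bytes` is just an illustrative helper, not part of exllamav2.

```python
def kv_cache_bytes(ctx_len, n_layers=60, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Approximate KV cache size in bytes for a given context length.

    Defaults are assumed Yi-34B values with an FP16 cache (2 bytes per element);
    verify them against the model's config.json before relying on the numbers.
    """
    # Leading factor of 2: one key tensor and one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * ctx_len


# Compare a few context-length caps, up to the full 200k window.
for ctx in (4096, 32768, 200_000):
    print(f"{ctx:>7} tokens -> {kv_cache_bytes(ctx) / 2**30:.1f} GiB of KV cache")
```

The cache scales linearly with the cap, so picking a smaller context length than 200k is what makes the model fit on a given GPU. In exllamav2 I believe the cap is the `max_seq_len` setting on the config object, but double-check that against the version you have installed.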