Hoping for a qwen-72b-chat-awq release (one that can run inference with vLLM)

#3
by tutu329 - opened

qwen-72b-chat is strong, but inference with HF is very slow, and ExLlama does not support it.
