Marlin kernel in vLLM - new checkpoint?
#10, opened by zoltan-fedor
Hi @casperhansen,
I have seen your tweet about the new Marlin kernel in vLLM making AWQ models much faster:
https://x.com/casper_hansen_/status/1814952968174678517?t=uaKsxU_LLB5SDP4CNQyokQ&s=19
I also saw your comment on the related PR on GitHub: https://github.com/vllm-project/vllm/pull/6612
Based on this, are you planning to add a new checkpoint for this model that supports vLLM's Marlin kernel?
Thanks!