Could you share how to quantize the model to int4 with GPTQ?
#7
by
loong
- opened
I followed the official documentation to quantize Qwen and got this error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Just call model.cuda().
https://github.com/PanQiWei/AutoGPTQ/issues/370#issuecomment-1766913012
See the link above: AutoGPTQ's internal device migration is incomplete in a few places.
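For anyone hitting the same error, here is a minimal sketch of where the `model.cuda()` call might go. It assumes the usual AutoGPTQ flow (`from_pretrained`, `quantize`, `save_quantized`) and that the call belongs between loading and quantizing. The function names `devices_in_use` and `quantize_to_int4`, the paths, and the config values are illustrative, not taken from the Qwen docs. `examples` is assumed to be a list of dicts with `input_ids` and `attention_mask`, which is the format AutoGPTQ expects for calibration data.

```python
import torch


def devices_in_use(module):
    """Collect the device of every parameter and buffer in `module`.

    More than one entry (e.g. {"cpu", "cuda:0"}) is the situation that
    produces the "Expected all tensors to be on the same device" error.
    """
    tensors = list(module.parameters()) + list(module.buffers())
    return {str(t.device) for t in tensors}


def quantize_to_int4(model_path, out_dir, examples):
    """Hypothetical helper: GPTQ-quantize a model to 4 bits with AutoGPTQ."""
    # Imported lazily so devices_in_use() works without AutoGPTQ installed.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

    # Illustrative settings; adjust group_size / desc_act to your needs.
    quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

    model = AutoGPTQForCausalLM.from_pretrained(
        model_path, quantize_config, trust_remote_code=True
    )

    # Workaround suggested in this thread: move the whole model to the GPU
    # before quantizing, so no weights are left behind on the CPU.
    model.cuda()
    print("devices before quantize:", devices_in_use(model))

    model.quantize(examples)
    model.save_quantized(out_dir)
```

If `devices_in_use(model)` still reports both `cpu` and `cuda:0` right before `quantize`, the mismatch is probably coming from somewhere other than the initial model placement.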
jklj077
changed discussion status to
closed