When I run the 4-bit model in vLLM, I get the following error. I am using an RTX 4090 (24 GB GPU memory).
Please make sure no other processes are occupying the GPU memory. 24 GB is enough to run the 34B 4-bit version.
@weiminw You might want to manually set the context length to 4096.
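A minimal sketch of how that could look, assuming an AWQ-quantized checkpoint (the model path here is a placeholder, not the exact repo in question). vLLM exposes the context length as `--max-model-len` on the CLI and `max_model_len` in the Python API; capping it at 4096 shrinks the KV-cache reservation so the model fits in 24 GB:

```shell
# Serve the 4-bit model with a reduced context window.
# <org>/<34B-4bit-model> is a placeholder -- substitute your checkpoint.
python -m vllm.entrypoints.openai.api_server \
    --model <org>/<34B-4bit-model> \
    --quantization awq \
    --max-model-len 4096 \
    --gpu-memory-utilization 0.90
```

The equivalent in the Python API would be `LLM(model=..., quantization="awq", max_model_len=4096)`. Without the cap, vLLM tries to pre-allocate KV cache for the model's full advertised context length, which is often what triggers the out-of-memory error on a 24 GB card.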