Request for pt weights

#8
by skt7 - opened

Hi @TheBloke, vLLM recently added support for GPTQ models, and it's the backend we use for serving our models. We'd love to use this model, but vLLM currently only supports "pt" weights, and this repo only has "safetensors". Any chance you could share the "pt" weights while they roll out support for "safetensors"?

Here is the PR with the ongoing discussion on this: https://github.com/vllm-project/vllm/pull/2028

We use safetensors with vLLM at the moment; not sure if this helps:

python3 -u -m vllm.entrypoints.openai.api_server \
    --host 0.0.0.0 \
    --model $model \
    --tensor-parallel-size $num_shard \
    --load-format safetensors \
    --quantization gptq \
    --dtype float16
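For what it's worth, once that server is up it exposes an OpenAI-compatible API (on port 8000 by default). A minimal sketch of querying it, assuming `$model` matches the `--model` value used above:

# Query the OpenAI-compatible completions endpoint (default port 8000).
# "$model" is assumed to be the same value passed to --model at launch.
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "'"$model"'",
        "prompt": "Hello, my name is",
        "max_tokens": 16
    }'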

I'll close this, since it was just a request and my original issue was resolved in later vLLM releases.

skt7 changed discussion status to closed