is this one fit for vllm deployment?
#4 by hulianxue - opened
As the title says: I want to use this model in my project. Is it suitable for deployment in a vLLM environment?
It seems to use more GPU memory than other models of the same size (70B) and the same quantization type (GPTQ), which could lead to a GPU OOM issue on an H100 or A100 device.
Doesn't seem so; I couldn't run it on a dual-GPU setup with 64 GB of total VRAM.
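For reference, a rough back-of-envelope estimate of why a 70B GPTQ model is tight on 64 GB of total VRAM. The model shape below (80 layers, 8 KV heads, head dim 128, Llama-2-70B-like) is an assumption, and real usage will be higher than this floor because of activation buffers, CUDA context, and vLLM's preallocated KV cache pool (`--gpu-memory-utilization`):

```python
# Back-of-envelope GPU memory for a 70B model quantized to 4-bit (GPTQ).
# All numbers are approximations; actual usage depends on group size,
# scales/zeros overhead, vLLM version, and runtime buffers.

GIB = 1024**3

def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight memory in bytes."""
    return n_params * bits_per_weight / 8

def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   dtype_bytes: int, n_tokens: int) -> float:
    """KV cache: 2 tensors (K and V) per layer, per KV head, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * dtype_bytes * n_tokens

# Assumed Llama-2-70B-like shape: 80 layers, 8 KV heads (GQA), head_dim 128.
weights_gib = weight_bytes(70e9, 4) / GIB            # 4-bit weights
kv_gib = kv_cache_bytes(80, 8, 128, 2, 4096) / GIB   # fp16 KV cache, 4k context

print(f"weights ≈ {weights_gib:.1f} GiB, KV cache (4k ctx) ≈ {kv_gib:.2f} GiB")
```

The 4-bit weights alone come to roughly 33 GiB, so on 2×32 GB cards there is very little headroom left for the KV cache and runtime overhead once the weights are sharded, which is consistent with the OOM reported above.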