What is the required amount of VRAM for running it?
#5 · opened by boohwooh
I will run this on runpod.io for a test. The base model (Llama-2 70B) is 120 GB, but this repo is more than 300 GB. What is the required amount of VRAM for running it?
I'm running it quantized to 4 bits (with a typical bitsandbytes load_in_4bit) and it's using 46.511 GB.
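As a rough sanity check on that figure, a back-of-envelope estimate (this helper and its 1.2× overhead factor for quantization constants, activations, and CUDA context are assumptions, not measurements) lands in the same ballpark:

```python
def vram_estimate_gb(n_params: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    """Rough VRAM needed to hold the weights, with a fudge factor
    for quantization constants, activations, and CUDA context."""
    weight_bytes = n_params * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# Llama-2 70B at 4 bits: 70e9 * 0.5 bytes = 35 GB of weights,
# ~42 GB with overhead -- close to the ~46.5 GB observed above.
print(round(vram_estimate_gb(70e9, 4), 1))  # -> 42.0
```

The remaining gap comes from runtime allocations (KV cache grows with context length), so actual usage varies with batch size and sequence length.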
Thank you. I was curious about the base model's VRAM requirements.
boohwooh changed discussion status to closed