
GPU Memory / RAM requirements

#19
by Rbn3D - opened

How much GPU memory does this model require to run? And in CPU mode, how much RAM? I'm currently trying to run it on a GPU (a GTX 1080 with 8 GB), and I'm getting a "cannot allocate memory" error, so I suppose this requires at least 16 GB or so.

I would assume it takes around 15 GB of VRAM without any optimizations! However, you can run it quite successfully on a CPU with 5-bit quantization, using only ~5.3 GB of RAM!

In theory, you might be able to run it in bfloat16 mode, but I don't know how, sorry.
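For what it's worth, here's a minimal sketch of loading in bfloat16 with transformers (this is my own sketch, not from the model card; note bf16 weights are still 2 bytes each, so this needs roughly the same ~14 GB and won't fit an 8 GB card):

```python
# Minimal sketch: loading MPT-7B-Instruct in bfloat16.
# Assumes a GPU with bf16 support (Ampere or newer) and ~14+ GB of VRAM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.bfloat16,  # load weights directly in bf16
    trust_remote_code=True,      # MPT uses custom modeling code
).to("cuda")

inputs = tokenizer("What is quantization?", return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```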

@Raspbfox I searched far and wide for a quantization example, but couldn't find one... =[

@danieldaugherty, just try searching for the GGML quantized models (usually q5_1) or GPTQ 👀

Ah yeah, I found that. But I didn't really understand how to use it...

GPTQ doesn't support MPT yet =[
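For the GGML route, here's a rough sketch of CPU inference using the third-party ctransformers library; the repo and file names below are illustrative assumptions, not links from this thread, so check the Hub for actual community quantizations:

```python
# Rough sketch: CPU inference on a 5-bit GGML quantization of MPT-7B
# via the third-party ctransformers library (pip install ctransformers).
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/MPT-7B-Instruct-GGML",               # assumed community repo
    model_file="mpt-7b-instruct.ggmlv3.q5_1.bin",  # assumed q5_1 file name
    model_type="mpt",                              # tell ctransformers the architecture
)
print(llm("What is quantization?", max_new_tokens=64))
```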

When you run this MPT-7B model in FP16, it consumes about 14 GB of GPU memory (7B parameters × 2 bytes per fp16 weight ≈ 14 GB, before activations and the KV cache), so you would need at least 16 GB of GPU memory to run this model for inference.

Closing as stale.

Also noting that we added device_map support as of this PR: https://huggingface.co/mosaicml/mpt-7b-instruct/discussions/41
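With that in place, a sketch of loading with device_map, which lets accelerate place layers across available devices (GPU first, spilling over to CPU RAM) and can help cards with less than 16 GB; requires `pip install accelerate`:

```python
# Sketch: device_map="auto" splits the model across GPU and CPU memory.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b-instruct",
    torch_dtype=torch.float16,
    device_map="auto",       # enabled for MPT by the PR linked above
    trust_remote_code=True,
)
```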

abhi-mosaic changed discussion status to closed
