How many GPUs for this model to run this fast?

#8
by edureisMD - opened

How many GPUs for this model to run this fast?

Apologies for hijacking your thread, but do you know how to load the model across multiple GPUs?

I don't have a definite answer to your question, but since the model size is around 60 GB, you will probably need around 70-80 GB of GPU memory (VRAM) in total to run it fast.
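For anyone who wants to redo that estimate for their own hardware, here is a rough back-of-the-envelope sketch. It assumes a 30B-parameter model stored in 16-bit precision (which matches the roughly 60 GB figure above) and a 25% allowance for activations and cache. The overhead fraction, the GPU sizes, and the function names are illustrative assumptions, not measured values.

```python
import math


def estimate_memory_gb(n_params_billion: float,
                       bytes_per_param: float,
                       overhead_fraction: float = 0.25) -> float:
    """Rough memory estimate in GB: weights plus a fractional overhead.

    overhead_fraction is a hypothetical allowance for activations,
    KV cache, and framework buffers; real usage depends on batch size
    and sequence length.
    """
    weights_gb = n_params_billion * bytes_per_param  # 1e9 params * bytes / 1e9
    return weights_gb * (1 + overhead_fraction)


def gpus_needed(total_gb: float, gpu_memory_gb: float) -> int:
    """Smallest number of equally sized GPUs whose combined memory covers total_gb."""
    return math.ceil(total_gb / gpu_memory_gb)


# Assumed: 30B parameters at 2 bytes each (fp16/bf16).
weights_only = estimate_memory_gb(30, 2, overhead_fraction=0.0)
with_overhead = estimate_memory_gb(30, 2, overhead_fraction=0.25)

for gpu_gb in (24, 40, 80):  # example card sizes, not a recommendation
    print(f"{gpu_gb} GB cards: {gpus_needed(with_overhead, gpu_gb)} needed")
```

With these assumptions the total lands inside the 70-80 GB range mentioned above. Quantizing to 8-bit or 4-bit lowers `bytes_per_param` and the GPU count with it.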

Hi Sven, thank you. I'm using something similar to this comment: https://huggingface.co/mosaicml/mpt-30b-chat/discussions/7
I'm just letting BitsAndBytesConfig split it across multiple GPUs, but I don't know if this is the right or most efficient way. I would love to learn.

Thank you. Unfortunately, I'm not able to run BitsAndBytes, but does it enable tensor parallelism as described in the following thread? https://huggingface.co/mosaicml/mpt-7b-instruct/discussions/23
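On the tensor parallelism question, it may help to see what a layer-wise split looks like. The sketch below is a simplified illustration, not the actual Accelerate or BitsAndBytes implementation. It assumes the split in question is a layer-wise device map (the kind `device_map="auto"` produces in transformers), which is a guess since the linked comment isn't quoted here. The layer count, layer size, and GPU budgets are hypothetical.

```python
def assign_layers(layer_sizes_gb, gpu_budgets_gb):
    """Greedily place consecutive layers on GPUs in order.

    Each layer lives entirely on one GPU. When the current GPU cannot
    hold the next layer, placement moves on to the next GPU.
    """
    device_map = {}
    gpu = 0
    used = 0.0
    for i, size in enumerate(layer_sizes_gb):
        # Skip ahead until a GPU with enough free budget is found.
        while gpu < len(gpu_budgets_gb) and used + size > gpu_budgets_gb[gpu]:
            gpu += 1
            used = 0.0
        if gpu == len(gpu_budgets_gb):
            raise MemoryError(f"layer {i} does not fit on the remaining GPUs")
        device_map[i] = gpu
        used += size
    return device_map


# Hypothetical model: 48 equal layers of 1.25 GB each (60 GB total),
# spread over three 24 GB GPUs.
layers = [1.25] * 48
device_map = assign_layers(layers, [24.0, 24.0, 24.0])
print(device_map)
```

In a split like this each GPU holds whole layers, and a forward pass visits the GPUs one after another, so it mainly buys memory capacity. Tensor parallelism instead shards each weight matrix across GPUs so that they work on the same layer at the same time, which is a different technique.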
