How much GPU memory is needed to finetune MPT-7B Instruct model?

#31
by skshreyas714 - opened

I benchmarked this model for sentiment classification, but the performance was very poor, so I want to finetune it on a multilingual sentiment classification dataset. I wanted to know the GPU memory requirements for finetuning it in FP16 mode.

Closing as stale. As noted above, finetuning with FP32 weights, FP32 gradients, and FP32 LionW optimizer state requires roughly 7B parameters * 4 bytes * 3 copies (weights + gradients + one LionW momentum buffer per parameter) = 84GB of memory, before accounting for activations and allocator overhead.
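For reference, a minimal back-of-the-envelope sketch of that arithmetic, assuming 7B parameters and one optimizer state per parameter for LionW; the function name is hypothetical and the estimate ignores activation memory and allocator overhead:

```python
def finetune_memory_gb(n_params_billion: float,
                       bytes_per_param: int = 4,
                       optimizer_states_per_param: int = 1) -> float:
    """Rough memory estimate for full finetuning, in GB (1 GB = 1e9 bytes).

    Counts one copy each of weights and gradients, plus
    `optimizer_states_per_param` copies of optimizer state
    (LionW keeps a single momentum buffer per parameter).
    Activations, CUDA context, and fragmentation are not included.
    """
    copies = 2 + optimizer_states_per_param  # weights + gradients + optimizer state
    return n_params_billion * bytes_per_param * copies


# All-FP32 case from the answer above: 7 * 4 * 3 = 84 GB
print(finetune_memory_gb(7, bytes_per_param=4))  # 84.0

# A naive all-FP16 estimate would halve this (7 * 2 * 3 = 42 GB), but
# mixed-precision training typically keeps an FP32 master copy of the
# weights and optimizer state, so the real savings are smaller.
print(finetune_memory_gb(7, bytes_per_param=2))  # 42.0
```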

abhi-mosaic changed discussion status to closed
