How to load the model with multiple GPUs
#5
by Sven00 - opened
I have not found any guidance on how to load the model and run inference with multiple GPUs. The instructions provided by MosaicML cover only a single GPU. Thank you.
Having the same issue. You can load the model by setting device_map="auto", which distributes the weights across the GPUs (it does not speed anything up), but I am still having issues with inference.
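For reference, here is a minimal loading sketch, assuming the mosaicml/mpt-7b checkpoint (swap in whatever model id you are using) and that accelerate is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mosaicml/mpt-7b"  # assumption: adjust to the checkpoint you use

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,   # halves the memory footprint per GPU
    device_map="auto",            # shards the layers across all visible GPUs
    trust_remote_code=True,       # MPT ships custom modeling code
)

# Inputs must sit on the GPU that holds the embedding layer (usually cuda:0);
# accelerate's hooks then move activations between the shards automatically.
inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda:0")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```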
@abhi-mosaic maybe you can help us out here?
Having the same issue with inference: the model loads fine across multiple GPUs, but inference is extremely slow. Any updates?
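Part of the slowness is expected: device_map="auto" only shards the layers, so during generation the GPUs take turns rather than computing in parallel. One alternative, sketched under the assumption that vLLM's MPT support covers your checkpoint, is tensor parallelism, which splits every layer across the GPUs so they all work at once:

```python
from vllm import LLM, SamplingParams

# Assumptions: vLLM is installed, the node has 2 GPUs, and
# mosaicml/mpt-7b is the checkpoint in question.
llm = LLM(
    model="mosaicml/mpt-7b",
    tensor_parallel_size=2,   # split each layer across 2 GPUs
    trust_remote_code=True,
)
outputs = llm.generate(
    ["Hello, my name is"],
    SamplingParams(max_tokens=32),
)
print(outputs[0].outputs[0].text)
```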