Out of memory

#5
by huyn1608 - opened

I tried to load the model (4.0 bpw) on an A100 80GB GPU but hit an out-of-memory (OOM) error. How can I load the model across multiple GPUs using exllamav2?

Thank you so much.
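One way to do this with exllamav2 is to either pass a manual `gpu_split` (a per-GPU memory budget, in GB) to `model.load()`, or create the cache lazily and call `model.load_autosplit()` to let the library spread the weights over all visible GPUs. The sketch below shows both; the model path, GPU capacities, and the `vram_allowance` helper are illustrative assumptions, and the exact `ExLlamaV2Config` constructor signature may differ between exllamav2 versions.

```python
# Sketch: loading an EXL2-quantized model across multiple GPUs with exllamav2.
# Paths and memory figures are placeholders; adjust for your setup.

def vram_allowance(capacities_gb, reserve_gb=2.0):
    """Per-GPU memory budget for exllamav2's manual `gpu_split` argument:
    each card's total VRAM minus a safety reserve for cache/activations.
    (Hypothetical helper, not part of exllamav2.)"""
    return [cap - reserve_gb for cap in capacities_gb]

def load_model(model_dir, capacities_gb=None):
    # Imported lazily so the sketch can be read without exllamav2 installed.
    from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config

    config = ExLlamaV2Config(model_dir)
    model = ExLlamaV2(config)

    if capacities_gb is None:
        # Autosplit: create the cache lazily and let exllamav2 distribute
        # the weights across all visible GPUs automatically.
        cache = ExLlamaV2Cache(model, lazy=True)
        model.load_autosplit(cache)
    else:
        # Manual split: cap memory use per GPU, e.g. two A100 80GB cards.
        model.load(gpu_split=vram_allowance(capacities_gb))
        cache = ExLlamaV2Cache(model)
    return model, cache

if __name__ == "__main__":
    model, cache = load_model("/path/to/exl2-4.0bpw", capacities_gb=[80.0, 80.0])
```

You can also restrict which GPUs are visible with `CUDA_VISIBLE_DEVICES` before launching the script.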

huyn1608 changed discussion status to closed
