Memory requirement
Hello, I’m trying to load the model on my server with 4 A4500 GPUs (20 GB memory each), but I always get an OOM error.
What’s the suggested memory requirement?
Hey, are you trying to load it for training or inference? The model itself is ~96 GB. Loading in 4-bit works for inference on a single A6000 card. I'm going to push an update to the modeling_gemmoe file here in a couple of hours, and that should make things a bit more stable.
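For context: 4 × 20 GB = 80 GB of VRAM is less than the ~96 GB of weights, so a full-precision load will OOM, while 4-bit quantization shrinks the weights to roughly a quarter of that. Below is a minimal, hedged sketch of 4-bit loading with transformers + bitsandbytes; the repo id is a placeholder and `trust_remote_code=True` assumes the repo ships its own modeling_gemmoe code. This is not the author's official loading code, which is still to be published.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "<model-repo-id>"  # placeholder: replace with the actual Hub repo id

# 4-bit NF4 quantization via bitsandbytes; weights stay in 4 bit, compute runs in bf16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",       # spread layers across the available GPUs automatically
    trust_remote_code=True,  # assumption: custom modeling_gemmoe code lives in the repo
)

inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```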
I see. Could you also share sample code for loading the model correctly?
Yeah, I'm currently writing up a whole document with all the code/info. It will be out by this evening.