Running the model on the MPS backend (MacBook GPUs) hangs indefinitely

#4 opened by ggaabe

I'm trying to run the GPU example on a MacBook M2 Max, but calling model.to("mps") simply hangs forever with no error message:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.float16)

print("Device 0:", model.device)  # prints: Device 0: cpu
mps_device = torch.device("mps")
print("Device 1:", mps_device)  # prints: Device 1: mps
model = model.to(mps_device)  # hangs here forever; the next line is never reached
print("Device 2:", model.device)

Any thoughts on what I could do to get this to run with GPU inference?

PyTorch Version: 2.1.0.dev20230505

How much GPU memory does your M2 Max have?

My bad. Something was probably misconfigured in my environment, though I have no idea what. I had installed all the packages with pip; I retried with conda and everything worked.
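
For anyone who hits the same hang, a quick sanity check can rule out a broken MPS install before moving a whole model over. This is a minimal sketch using PyTorch's torch.backends.mps API; the tiny tensor round-trip is just a smoke test and is not from the original thread:

import torch

# Both should print True on an Apple-silicon Mac with a recent PyTorch build.
print("MPS built:", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())

if torch.backends.mps.is_available():
    # Tiny tensor round-trip: if this hangs or errors, the environment
    # (not the model) is the problem.
    x = torch.ones(3, device="mps")
    print((x * 2).to("cpu"))

If either check fails or the round-trip hangs, reinstalling PyTorch in a fresh environment (as above, conda worked where pip did not) is a reasonable next step.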
