`std::runtime_error: [Matmul::eval_cpu] Currently only supports float32`
#2
by adhishthite - opened
Hello,
I am getting this error on my Mac M1 Pro running Sonoma 14.3, with Python 3.11 and the latest PyTorch installed.
Please visit this link for full output: https://app.warp.dev/block/NMbYuCAkwfcxcQ7zjhZv8n
```
In [5]: response = generate(model, tokenizer, prompt="<step>Source: user Fibonacci series in Python<step> Source: assistant Destination: user", verbose=True)

==========
Prompt: <step>Source: user Fibonacci series in Python<step> Source: assistant Destination: user
libc++abi: terminating due to uncaught exception of type std::runtime_error: [Matmul::eval_cpu] Currently only supports float32.
[1]    9782 abort      ipython
```
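For anyone hitting this, one possible (unconfirmed) workaround is to make sure evaluation is dispatched to the GPU stream, since the error indicates the CPU backend's matmul only supports float32. Below is a minimal sketch assuming the mlx-lm `load`/`generate` API; the model repo name is a placeholder, and the float32 cast is only applied to half-precision parameters, since casting packed q4 weights would corrupt them:

```python
# Unconfirmed workaround sketch; "mlx-community/model-7b-q4" is a placeholder.
import mlx.core as mx
from mlx.utils import tree_map
from mlx_lm import load, generate

# Dispatch ops to the GPU stream; the CPU matmul kernel only supports float32.
mx.set_default_device(mx.gpu)

model, tokenizer = load("mlx-community/model-7b-q4")  # placeholder repo id

# Alternative: cast half-precision parameters up to float32 so CPU matmuls
# also succeed (uses more memory; leaves packed quantized weights untouched).
model.update(
    tree_map(
        lambda p: p.astype(mx.float32) if p.dtype in (mx.float16, mx.bfloat16) else p,
        model.parameters(),
    )
)

response = generate(
    model,
    tokenizer,
    prompt="<step>Source: user Fibonacci series in Python<step> Source: assistant Destination: user",
    verbose=True,
)
```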
How much RAM does your M1 Pro have? This model requires a machine with at least 64GB to run with q4.
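(For rough sizing, arithmetic added here rather than from the thread: q4 stores about 0.5 bytes per parameter, so a 7B model is roughly 7e9 × 0.5 B ≈ 3.5 GB of weights, while a model that genuinely needs 64GB at q4 would be on the order of 70B parameters.)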
@ivanfioravanti I have the 16GB M1 Pro, but I still got this error with a 7B Q4-quantized model.
Please see my GitHub issue if you know how to fix it. Many thanks!
https://github.com/ml-explore/mlx/issues/753