Are there any plans to do this for Mixtral? Also, with llama.cpp, would 512 GB of RAM work for CPU inference, or would I have to spend another $800 to move up to 1 TB?
Thanks
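For what it's worth, a rough back-of-envelope estimate suggests 512 GB is far more than enough. The sketch below assumes Mixtral 8x7B (~46.7B total parameters, all experts resident in RAM even though only two are active per token) and typical llama.cpp bits-per-weight figures; both the parameter count and the quantization sizes are assumptions, and KV-cache/context overhead is excluded:

```python
# Back-of-envelope RAM estimate for llama.cpp CPU inference.
# Assumption: Mixtral 8x7B with ~46.7B total parameters.
PARAMS = 46.7e9

def model_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight-file size in GB (excludes KV cache and runtime overhead)."""
    return params * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight for common formats (assumed values):
for name, bits in [("FP16", 16), ("Q8_0", 8.5), ("Q4_K_M", 4.85)]:
    print(f"{name}: ~{model_gb(PARAMS, bits):.0f} GB")
```

Even unquantized FP16 weights come out near 93 GB, so a 512 GB machine would leave plenty of headroom for context; a larger MoE model would change the arithmetic, but the same formula applies.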