Running on llama.cpp

#2
by YvonF - opened

When trying to run with llama.cpp
./llama.cpp/server --port 8002 --host 0.0.0.0 -m llama.cpp/models/Mistral-Nemo-Instruct-2407-Q5_K_M.gguf -c 128000
I got: error loading model: create_tensor: tensor 'blk.0.attn_q.weight' has wrong shape; expected 5120, 5120, got 5120, 4096, 1, 1
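For context, the shape mismatch is consistent with Mistral Nemo using a head dimension that is decoupled from hidden_size / num_heads, which older llama.cpp builds assumed. A quick sanity check of the arithmetic (the dimension values below are assumed from Nemo's published config, not from this thread):

```python
# Assumed Mistral Nemo 12B attention dimensions (from the model's config.json)
hidden_size = 5120
num_heads = 32
head_dim = 128  # Nemo sets head_dim explicitly; it is NOT hidden_size // num_heads

# Older llama.cpp derived head_dim from the hidden size:
assumed_head_dim = hidden_size // num_heads          # 160
expected_q_shape = (hidden_size, num_heads * assumed_head_dim)  # (5120, 5120)

# The Q projection actually stored in the GGUF is smaller:
actual_q_shape = (hidden_size, num_heads * head_dim)            # (5120, 4096)

print("expected:", expected_q_shape, "actual:", actual_q_shape)
```

This reproduces the "expected 5120, 5120, got 5120, 4096" numbers from the error message, which is why a llama.cpp build with proper Nemo support is needed rather than a re-quantization alone.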

llama.cpp does not support this model yet.

They just added support for the tokenizer a few hours ago; a few other things still to go, though.

In which release number is the support included?

Second State org

The GGUF models have already been updated and are based on llama.cpp b3438. If there are any further issues, please let us know. Thanks a lot!