Failed to load model
$ ~/workspace/llama.cpp/server -m ./granite-34b-code-instruct-gguf/granite-34b-code-instruct.Q8_0.gguf -c 8192 --host 0.0.0.0 --port 8501 -ngl 81 -t 10 --mlock
It fails with the following messages:
llama_model_load: error loading model: check_tensor_dims: tensor 'output.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './granite-34b-code-instruct-gguf/granite-34b-code-instruct.Q8_0.gguf'
{"tid":"0x1f67f3ac0","timestamp":1715049990,"level":"ERR","function":"load_model","line":685,"msg":"unable to load model","model":"./granite-34b-code-instruct-gguf/granite-34b-code-instruct.Q8_0.gguf"}
@davideuler Yep, I'm pretty sure it's an issue with llama.cpp not yet supporting the IBM granite models.
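If you want to confirm that yourself, you can inspect the file's metadata and tensor names. Here is a minimal sketch, assuming the `gguf` Python package (shipped with llama.cpp under `gguf-py`, also on PyPI) is installed; the path is the model from the command above:

```python
# Minimal sketch: dump the declared architecture and check for 'output.weight'.
from gguf import GGUFReader

reader = GGUFReader("./granite-34b-code-instruct-gguf/granite-34b-code-instruct.Q8_0.gguf")

# Print the architecture string the file declares in its metadata.
arch_field = reader.fields["general.architecture"]
arch = bytes(arch_field.parts[arch_field.data[0]]).decode("utf-8")
print("architecture:", arch)

# Check whether the tensor the loader complains about is actually in the file.
names = {t.name for t in reader.tensors}
print("'output.weight' present:", "output.weight" in names)
```

If the tensor is genuinely absent, the file isn't corrupt; the loader just expects a tensor layout that this architecture doesn't provide yet.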
Thanks, I hope it will be supported soon.
There is an open feature request to add support: https://github.com/ggerganov/llama.cpp/issues/7116