Can't load model?
#2
by nacs - opened
I've tried the latest llama.cpp (as of August 11th) as well as the commit mentioned in the README (commit e76d630), but I get an error when trying to load this model. I've tried q2_K.bin as well as q3_K_S.bin, and both give the following error:
llama_model_load_internal: using CUDA for GPU acceleration
ggml_cuda_set_main_device: using device 0 (NVIDIA GeForce RTX 3060) as main device
error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/airoboros-l2-70b-gpt4-2.0.ggmlv3.q2_K.bin'
main: error: unable to load model
Am I doing something wrong?
Sorry, this was my fault. I missed the part of the README that says to use "-gqa 8" with this model. It works now, thanks.
nacs changed discussion status to closed