Cannot load the model in Koboldcpp 1.28

#1
by FenixInDarkSolo - opened

I downloaded the q4_0 and q8_0 models to test, but they cannot be loaded in koboldcpp 1.28.
I have checked the SHA256 hashes and confirmed that both of them are correct.
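For reference, the hash check was done with a small script along these lines (a minimal sketch; the file name is taken from the log below):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so large model files don't fill RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the SHA256 published on the model page.
print(sha256_of("planner-7b.ggmlv3.q8_0.bin"))
```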
```
> koboldcpp_128.exe --threads 12 --smartcontext --unbantokens --contextsize 2048 --blasbatchsize 1024 --useclblast 0 0 --gpulayers 3
Welcome to KoboldCpp - Version 1.28
For command line arguments, please refer to --help
Otherwise, please manually select ggml file:
Attempting to use CLBlast library for faster prompt ingestion. A compatible clblast will be required.
Initializing dynamic library: koboldcpp_clblast.dll

Loading model: D:\program\koboldcpp\planner-7b.ggmlv3.q8_0.bin
[Threads: 12, BlasThreads: 12, SmartContext: True]


Identified as LLAMA model: (ver 5)
Attempting to Load...

System Info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
llama.cpp: loading model from D:\program\koboldcpp\planner-7b.ggmlv3.q8_0.bin
error loading model: unrecognized tensor type 14

llama_init_from_file: failed to load model
gpttype_load_model: error: failed to load model 'D:\program\koboldcpp\planner-7b.ggmlv3.q8_0.bin'
Load Model OK: False
Could not load model: D:\program\koboldcpp\planner-7b.ggmlv3.q8_0.bin
```

But I can successfully load it in llama.cpp.
FenixInDarkSolo changed discussion title from Cannot load the model in Koboldcpp to Cannot load the model in Koboldcpp 1.28

Shit. I hadn't realised that the new llama.cpp k-quant commit had changed q4_0, q4_1, q5_0, q5_1 and q8_0.

I happened to do this model 2 hours after the k-quant PR was merged (https://github.com/ggerganov/llama.cpp/pull/1684), so yeah, the files only work with the latest llama.cpp.

I am sure koboldcpp will add support pretty soon, but for now they won't work.

I'll see about re-doing them with the previous llama.cpp. For the next week or two I'm going to do q4_0, q4_1, q5_0, q5_1 and q8_0 using llama.cpp from before the k-quant merge, and do the new k-quant methods using the latest code, to ensure maximum compatibility.
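For anyone who wants to check which format a downloaded file carries before loading it, the header can be inspected directly. A minimal sketch, assuming the GGJT header layout llama.cpp used at the time (magic, file version, then seven uint32 hyperparameters ending in ftype, with the quantization version folded into ftype at a factor of 1000):

```python
import struct

def read_ggjt_header(path: str):
    """Read magic, file version and ftype from a llama.cpp GGJT model file."""
    with open(path, "rb") as f:
        magic, version = struct.unpack("<II", f.read(8))
        if magic != 0x67676A74:  # ASCII 'ggjt'
            raise ValueError(f"not a GGJT file (magic {magic:#x})")
        # Assumed hparams order: n_vocab, n_embd, n_mult, n_head, n_layer, n_rot, ftype
        *_, ftype_raw = struct.unpack("<7I", f.read(28))
    # Assumption: a quantization version is folded into ftype (factor 1000).
    qnt_version, ftype = divmod(ftype_raw, 1000)
    return version, qnt_version, ftype

print(read_ggjt_header("planner-7b.ggmlv3.q8_0.bin"))
```

Note this only reports the file-level ftype; the per-tensor types (like the type 14 in the error above) would need walking the tensor records as well.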

I'm running into a similar issue with llama-cpp-python. I haven't dug deep yet.

```
llama.cpp: loading model from ../python3.11/site-packages/llama_cpp/models/7B/planner-7b.ggmlv3.q5_0.bin
error loading model: unrecognized tensor type 14
llama_init_from_file: failed to load model
```
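Roughly how I'm loading it, via llama-cpp-python's Llama class (path shortened; model path is the one from the log):

```python
from llama_cpp import Llama

# Fails with "unrecognized tensor type 14" when the bundled llama.cpp
# predates the k-quant merge; works once the bindings catch up.
llm = Llama(model_path="models/7B/planner-7b.ggmlv3.q5_0.bin")
out = llm("Hello", max_tokens=16)
print(out["choices"][0]["text"])
```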

Yeah sorry. Right now the files can only be used with the latest llama.cpp.

I will re-generate them shortly.

I have updated all the old quant types: q4_0, q4_1, q5_0, q5_1, q8_0. They are now generated with an older version of llama.cpp.

Please re-download, re-test and let me know.
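If you're scripting the re-download, a sketch using huggingface_hub (the repo id below is a placeholder; substitute the actual model repo):

```python
from huggingface_hub import hf_hub_download

# Placeholder repo id -- substitute the actual model repo.
path = hf_hub_download(
    repo_id="your-org/planner-7B-GGML",
    filename="planner-7b.ggmlv3.q4_0.bin",
)
print("Downloaded to", path)
```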

I have a custom automatic updater for Koboldcpp on Windows, if anyone is interested.
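It's not the updater itself, but the core idea fits in a few lines of Python (a sketch assuming the GitHub releases API and the LostRuins/koboldcpp repo):

```python
import json
import urllib.request

API = "https://api.github.com/repos/LostRuins/koboldcpp/releases/latest"

# Query the latest release and download the Windows executable asset.
with urllib.request.urlopen(API) as resp:
    release = json.load(resp)

print("Latest release:", release["tag_name"])
for asset in release["assets"]:
    if asset["name"].endswith(".exe"):
        urllib.request.urlretrieve(asset["browser_download_url"], asset["name"])
        print("Downloaded", asset["name"])
```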

I have downloaded the q4_0 again and tested it in koboldcpp 1.28, and it works. Thank you for the fix.

KoboldCpp was updated to 1.29 recently and offers partial support for k-quants: partial because it only supports OpenBLAS for now, not CLBlast yet. 😕
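Until CLBlast support arrives, dropping the CLBlast flags from the command in the first post should fall back to the OpenBLAS path (untested sketch):

```
> koboldcpp.exe --threads 12 --smartcontext --unbantokens --contextsize 2048 --blasbatchsize 1024
```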
