It no longer works
Log start
main: build = 3136 (f2b5764b)
main: built with cc (GCC) 14.1.1 20240507 (Red Hat 14.1.1-1) for x86_64-redhat-linux
main: seed = 1719392023
GGML_ASSERT: ggml.c:21470: 0 <= info->type && info->type < GGML_TYPE_COUNT
zsh: IOT instruction (core dumped) ./main -m ~/Downloads/bitnet_b1_58-3B.q2_2.gguf -p "my name is"
BitNet PR integrated; it works in CPU mode (only recent, properly converted BitNet models work; older ones do not).
Example : https://huggingface.co/BoscoTheDog/bitnet_b1_58-xl_q8_0_gguf/tree/main
I saw this in a koboldcpp fork with the BitNet PR merged.
Yes, I am using the newest version. I did a git pull and recompiled; it says merged.
I assume something changed, though.
You could try this:
BoscoTheDog/bitnet_b1_58-xl_q8_0_gguf
It works for me using this koboldcpp fork.
Edit: Just to rule it out, this repo crashes for me too. Something must have changed.
The PR that got merged went in without general model support; it only added internal support for Q2_2 tensors, not support for loading Q2_2 models.
I updated the repo with new files created from the continuation by compilade here: https://github.com/ggerganov/llama.cpp/pull/8151
Be sure to give them a thumbs up for their work :)