The model generates a bunch of garbage?
Very strange: running the following example produces incoherent output.
./main -ngl 35 -m Calme-4x7B-MoE-v0.2-GGUF.Q4_K_M.gguf --color -c 32768 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"
As a model trained on
20: Question: newlines:
- and the most commonly known as an individual to:
- A large amountingredactually,
INSTARTIme in598746/ "The question:As a humanoidentitled by the of AI is designed to #A20 on an important AI generated from areservaries an open AIr an dtookaying a instantiq an individual a, which includes one oftheone of the with are known as being trained, a key in many be a trainAI
INST at the to a commona t a important an unnamed byof theare designed here ares Ais an important individualizes theare more often is the, a 'd the one is commonly referred the one are all ofthe most important a are being a spacefor 5: The H. Here a varietyn's are not only to a common theare uncommona the period4s T o Nan A and0be to be 1 ( the are
to F2 I, known 7 33re " is a # an E for Manning 9 This array 6A 5h some, with the American 8n The 2. As 3 were designed by 4, its own 20 to 91 are a T-like
Nieline known Train Commonalysis train in your trainable 9
various trains
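For reference, in case someone wants to reproduce this, here is the same invocation with the ChatML placeholders filled in; the system message and question below are just made-up examples:
./main -ngl 35 -m Calme-4x7B-MoE-v0.2-GGUF.Q4_K_M.gguf --color -c 32768 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is a mixture-of-experts model?<|im_end|>
<|im_start|>assistant"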
Yes, there is an issue with llama.cpp's quantization of MoE models coming from MergeKit. (The actual model works fine, and so does the fp16 GGUF.)
I am following it up in llama.cpp; hopefully they can fix/support MoE quantization soon, and then I will re-upload the models. (I will upload the working fp16 soon.)
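For context, the workflow I mean is roughly the following (file names are placeholders): the f16 conversion step produces a usable GGUF, and it is the Q4_K_M quantization step that ends up broken for these MergeKit MoE models.
# convert the HF model to an f16 GGUF (this one works)
python convert.py ./Calme-4x7B-MoE-v0.2 --outtype f16 --outfile Calme-4x7B-MoE-v0.2.fp16.gguf
# quantize to Q4_K_M (the output of this step is what generates gibberish)
./quantize Calme-4x7B-MoE-v0.2.fp16.gguf Calme-4x7B-MoE-v0.2.Q4_K_M.gguf Q4_K_M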
Maybe it has something to do with this.
Good find! I've been looking for mergekit-related issues, and there is a lot of work in progress.
However, this PR seems to be merged already, and I quantized these models with a build from 2 days ago. Unfortunately, that means I was already using the PR and it still doesn't work. I'll keep looking for an appropriate issue/PR to chime in on; if I can't find one, I'll open a new one.
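(Side note: one way to double-check whether a local llama.cpp build already contains a merged PR is to test whether its merge commit is an ancestor of HEAD; <merge-commit> below is a placeholder for the hash shown on the PR page.)
cd llama.cpp
git fetch origin
# exits 0 and prints the message if the PR's merge commit is already in this checkout
git merge-base --is-ancestor <merge-commit> HEAD && echo "PR is in this build"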
It seems to be the same problem; I don't know how he solved it.
https://huggingface.co/zhengr/MixTAO-7Bx2-MoE-v8.1/discussions/3
https://huggingface.co/MaziyarPanahi/MixTAO-7Bx2-MoE-v8.1-GGUF/discussions/3
So the thing is, my quantization of his model actually works.
This could mean that llama.cpp either worked with MoE models at some point and broke in very recent changes (I always pull and make a fresh build), or that the way he built his MoE is quite different. I'll see if I can ask him how he made the MoE: was it just a mergekit merge via hidden gates, or did he do something extra?
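If it helps, a quick way to compare the two is to run the exact same prompt through both Q4_K_M files side by side; the file names below are just how I'd name the local copies:
for m in MixTAO-7Bx2-MoE-v8.1.Q4_K_M.gguf Calme-4x7B-MoE-v0.2.Q4_K_M.gguf; do
  echo "=== $m ==="
  ./main -m "$m" -ngl 35 -c 4096 --temp 0.7 --repeat_penalty 1.1 -n 128 -p "<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Describe yourself in one sentence.<|im_end|>
<|im_start|>assistant"
done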