Model sometimes returns an empty response in llama.cpp

#2
opened by jost

So far I've been using the q5_k_m.gguf and q4_k_m.gguf models for German. Most of the time they work fine, but sometimes the model answers with an empty string.

For example:
prompt = f"""
<|im_start|>system
Du bist ein hilfreicher Assistent.
<|im_end|>
<|im_start|>user
Beantworte das folgende Statement mit 'Deutliche Ablehnung', 'Ablehnung', 'Zustimmung' oder 'Deutliche Zustimmung': Niemand sucht sich sein Geburtsland aus, daher ist albern, darauf stolz zu sein.
<|im_end|>
<|im_start|>assistant
"""
output = llm(prompt, max_tokens=200, temperature=0)
--> ...'choices': [{'text': '', 'index': 0, 'logprobs': None, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 93, 'completion_tokens': 0, 'total_tokens': 93}}
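In case it helps others debugging this, here is a minimal workaround sketch, assuming the same llama-cpp-python completion API as above: it checks whether the returned text is empty and retries with a slightly higher temperature. The helper name, retry count, and temperature bump are illustrative assumptions, not a confirmed fix.

def complete_with_retry(llm, prompt, max_tokens=200, retries=3):
    # Hypothetical helper: retry when the model returns an empty completion.
    temperature = 0.0
    for _ in range(retries):
        output = llm(prompt, max_tokens=max_tokens, temperature=temperature)
        if output["choices"][0]["text"].strip():
            return output  # got a non-empty answer
        temperature += 0.3  # illustrative bump before retrying, not a tuned value
    return output  # still empty after the last attempt

output = complete_with_retry(llm, prompt)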
