TheBloke/Llama-2-7b-(Chat-)GPTQ repeats request
#22 by hyzhak - opened
Hi team,
I'm passing a request to the model, and it repeats my request (with slight variations) before adding its response. Is this expected?
- revision: gptq-4bit-32g-actorder_True
- do_sample: True
- temperature: 0.25
- repetition_penalty: 1.2
- max_new_tokens: 512
Example (I tried using an instruction format like [INST], but it didn't help).
Input:
Please write a haiku about llama
Output from TheBloke/Llama-2-7b-Chat-GPTQ:
Please write a haiku about llama-ing.
Here is my attempt:
Llama's gentle glow,
Softly grazes the landscape,
Serenity found.
Output from TheBloke/Llama-2-7b-GPTQ:
Please write a haiku about llama.
I'll start:
Llama is my friend,
He lives in the zoo.
His name is Llamalot!
Is there any way to prevent Llama 2 from repeating the request? As the "llama-ing" case shows, it isn't only a matter of removing a few similar characters at the beginning; sometimes the difference could be larger.
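One possible explanation, offered as an assumption rather than a confirmed diagnosis: the decoded output contains the prompt tokens followed by the model's continuation, so "llama-ing." would be the original prompt plus newly generated text rather than a reworded copy. If that is the case, cutting the output by prompt token count (instead of matching strings) should drop the echoed request regardless of how the continuation starts. A minimal sketch follows; the helper names `build_chat_prompt` and `strip_prompt_tokens` are hypothetical, and the commented usage assumes a `model` and `tokenizer` already loaded with the Hugging Face transformers API.

```python
def build_chat_prompt(user_message, system_prompt=None):
    """Wrap a message in the Llama 2 chat template.

    The BOS token is left out on purpose; the tokenizer normally adds it.
    """
    if system_prompt:
        return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"
    return f"[INST] {user_message} [/INST]"


def strip_prompt_tokens(output_ids, prompt_len):
    """Return only the token ids generated after the prompt.

    Works on any sliceable sequence (a Python list or a 1-D tensor).
    """
    return output_ids[prompt_len:]


# Hypothetical usage, assuming `model` and `tokenizer` are already loaded:
#
# prompt = build_chat_prompt("Please write a haiku about llama")
# inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# output = model.generate(**inputs, do_sample=True, temperature=0.25,
#                         repetition_penalty=1.2, max_new_tokens=512)
# new_ids = strip_prompt_tokens(output[0], inputs["input_ids"].shape[1])
# print(tokenizer.decode(new_ids, skip_special_tokens=True))
```

For the base (non-chat) model the same slicing applies, but the template does not: that model simply continues whatever text it is given, so a continuation such as "-ing." at the start of the new tokens is still possible.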