What's the prompt template?
Is it just "Q:... \n\nA:..."?
We have no fixed template, but you can generally use the following structure and adapt it to your needs:
Task Definition...
Input: xxx
Output: yyy
Input: xxx
Output: yyy
Input: xxx
Output: yyy
...
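A minimal sketch of how the structure above could be assembled in Python. The helper name build_few_shot_prompt and its parameters are hypothetical and not part of any library; the final "Output:" is left open for the model to complete.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(
    task_definition: str,
    examples: Sequence[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a task definition, solved examples, and one open query."""
    lines = [task_definition]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
    # The last "Output:" stays empty so the model fills it in.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical usage with placeholder texts.
prompt = build_few_shot_prompt(
    "Summarize the abstract in at most 30 words.",
    [("Blablabla Apples blalbabla", "The text focuses on apples.")],
    "Blablabla Oranges blalbabla",
)
print(prompt)
```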
@mauriceweber Thank you, Maurice! It helps, but unfortunately the instruct model treats the input from the examples as part of my own input, the one I need the output for. Do you know how to prevent this behaviour?
Can you show me an example of the prompts you are designing?
@mauriceweber sure! Here you go:
"In this task, you are given the abstract of a research paper. Your task is to generate a summary of this abstract. Your summary should not be very short, but it's better if it's not more than 30 words.
Input: ....
Output: ....
Input: {my text}
Output:"
The problem is that the model leaks "Input: ...." into {my text}: I can see parts of the one-shot example in the model output.
Maybe I need to add something like ### RESPONSE: at the end?
Thank you.
I see, so if I'm understanding it correctly, then the problem is that the model keeps generating input / outputs until the max length is reached.
I think what you can do in this case is implement a stopping criterion: for example, as soon as "Input:" is generated, you stop the generation process. You can find some examples of how to implement this in this thread: https://discuss.huggingface.co/t/implimentation-of-stopping-criteria-list/20040.
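The idea can be sketched independently of any particular library: collect the generated text piece by piece and stop as soon as the marker appears. The function collect_until_stop and the streamed pieces below are hypothetical stand-ins for a real generation loop; the linked thread covers the Transformers-specific mechanism.

```python
from typing import Iterable


def collect_until_stop(pieces: Iterable[str], stop: str = "Input:") -> str:
    """Concatenate streamed text pieces and stop once `stop` appears."""
    text = ""
    for piece in pieces:
        text += piece
        # Search the accumulated text so a marker split across pieces is found.
        cut = text.find(stop)
        if cut != -1:
            return text[:cut].strip()
    return text.strip()


# Hypothetical stream: the model starts a new "Input:" block after the answer.
streamed = [" The text", " focuses on oranges.", "\nIn", "put:", " Blablabla"]
print(collect_until_stop(streamed))  # The text focuses on oranges.
```

Checking the accumulated text rather than each piece on its own matters because a tokenizer may split "Input:" across several tokens.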
Apologies for the misunderstanding, but my problem is a different one. For example, suppose the hypothetical prompt is:
"In this task, you are given the abstract of a research paper. Your task is to generate a summary of this abstract. Your summary should not be very short, but it's better if it's not more than 30 words.
Input: Blablabla Apples blalbabla
Output: The text focuses on apples.
Input: Blablabla Oranges blalbabla
Output:"
In this case I get the following output from the model: "The text focuses on apples and oranges."
Do you see how the one-shot example leaks into the actual input for the model?
Oh now I understand, thanks for the clarification. In that case I think it is a question of optimizing the prompt template. Have you tried to prompt the model in other ways?
For example, what I imagine could work better than the above template is something like this:
Summarize the following document.
<your abstract goes here>
Summary:
Indeed, this template works better since it does not include any examples. I had assumed that few-shot prompting is always better than zero-shot prompting. However, have you also noticed that changing the temperature has no impact on the output? For example, with the template you proposed I get a nice summary, but it is extractive (i.e., not rewritten, rephrased, or diverse in any sense). How can I prompt the model to write a novel summary in a narrative style, or to focus on specific topics in the document? Thank you.
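One possible explanation for the temperature observation, assuming generation runs with greedy decoding (no sampling): temperature only rescales the token scores, so the top-scoring token stays the same at every step. The sketch below uses made-up logits to illustrate this; whether it applies here depends on the generation settings actually in use.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    greedy_choice = probs.index(max(probs))
    # The probabilities change with temperature; the greedy choice does not.
    print(temperature, [round(p, 3) for p in probs], greedy_choice)
```

With sampling enabled, a higher temperature flattens the distribution and makes less likely tokens more probable, which is where more varied output would come from.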
You can try experimenting more with the template above. For example, if you want a narrative style, you can include that instruction in your prompt template. The same goes for the topics you want the model to focus on.
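As a rough sketch of that suggestion, the zero-shot template could be parameterized by style and topics. The function build_summary_prompt, its parameters, and the exact instruction wording are assumptions for illustration only; the phrasing that works best will depend on the model.

```python
from typing import Optional, Sequence


def build_summary_prompt(
    document: str,
    style: Optional[str] = None,
    focus_topics: Optional[Sequence[str]] = None,
) -> str:
    """Build a zero-shot summary prompt with optional style and topic hints."""
    instruction = "Summarize the following document"
    if style:
        instruction += f" in a {style} style"
    if focus_topics:
        instruction += ", focusing on " + " and ".join(focus_topics)
    instruction += "."
    # Without hints this reduces to the plain template from the thread.
    return f"{instruction}\n{document}\nSummary:"


# Hypothetical usage with a placeholder document.
print(build_summary_prompt("<your abstract goes here>", style="narrative", focus_topics=["apples"]))
```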
Thanks, I've tried it out, but the output is pretty unstable... Well, I will go on with fine-tuning the base model. Thank you.