How are the n-shot prompts selected?
#39, opened by SamG123
How are the n shots identified and selected? Are they the same for all models?
The leaderboard states that ARC is evaluated with 25 shots. How are these 25 examples chosen, and which n-shot prompts are actually used for this leaderboard?
Hey, I had the same question and then read in the evaluation harness repo (EleutherAI's lm-evaluation-harness, as far as I can tell, rather than HELM) that you can write out the prompts that were used with:
python write_out.py \
    --tasks all_tasks \
    --num_fewshot 5 \
    --num_examples 10 \
    --output_base_path /path/to/output/folder
If you run that with --tasks set to the ARC task and --num_fewshot changed to 25, the output might be the prompts that were used.
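On the "same for all models" part: I have not checked how the harness actually picks its shots, so the following is only a rough sketch of one common approach, assuming the shots are sampled from a training pool with a fixed random seed. The function name `build_few_shot_prompt`, the seed value, the prompt template, and the toy pool are all hypothetical and are not taken from the leaderboard code.

```python
import random


def build_few_shot_prompt(train_pool, query, num_fewshot, seed=1234):
    """Sample num_fewshot solved examples and prepend them to the query.

    Hypothetical sketch: the seed and the "Question/Answer" template are
    assumptions, not the leaderboard's actual settings.
    """
    rng = random.Random(seed)  # fixed seed, so every run picks the same shots
    shots = rng.sample(train_pool, num_fewshot)
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in shots]
    # The query goes last, with the answer left blank for the model to fill in.
    blocks.append(f"Question: {query}\nAnswer:")
    return "\n\n".join(blocks)


# Toy pool standing in for a real training split.
pool = [(f"q{i}", f"a{i}") for i in range(100)]
prompt_a = build_few_shot_prompt(pool, "new question", num_fewshot=25)
prompt_b = build_few_shot_prompt(pool, "new question", num_fewshot=25)
```

With a fixed seed the two calls produce identical prompts, which is what would make the shots the same for every model. Whether the leaderboard works this way is exactly what the written-out prompts would confirm.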
SamG123 changed discussion status to closed