Update README.md

Files changed (1)

README.md CHANGED

@@ -96,4 +96,3 @@ The model achieves the following results without any fine-tuning (zero-shot):
 To get these results, we used the Eleuther AI evaluation harness [here](https://github.com/EleutherAI/lm-evaluation-harness),
 which can produce results different than those reported in the GPT2 paper. The p-values come from the stderr from the evaluation harness, plus a normal distribution assumption.
-We chose these 20 tasks, because they are the tasks that the GPT2 and GPT3 papers report results for.
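
The context lines above say the p-values are derived from the harness stderr plus a normal distribution assumption, but the diff does not show the computation. One way this could look is a two-sided z-test on the difference between two scores. This is a sketch only: the function name, the two-model comparison, and the independence of the two standard errors are assumptions, not something the README confirms.

```python
import math


def two_sided_p_value(score_a, stderr_a, score_b, stderr_b):
    """Two-sided p-value for score_a != score_b under a normal approximation.

    Hypothetical helper: assumes the two standard errors are independent,
    so their variances add.
    """
    combined_stderr = math.sqrt(stderr_a ** 2 + stderr_b ** 2)
    if combined_stderr == 0:
        raise ValueError("at least one standard error must be non-zero")
    z = (score_a - score_b) / combined_stderr
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))
```

Here `score_*` and `stderr_*` stand for a task metric and the standard error reported next to it by the evaluation harness, one pair per model being compared.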

