Recreating MMLU scores

by theblackcat102 - opened

Do you guys use lm-evaluation-harness for MMLU evaluation? With this checkpoint, I'm not getting the stark improvement shown in the fineweb-edu image.

HuggingFaceFW org

We do not. I've added a note to the top of this file detailing how you can reproduce our setup.

@guipenedo Thanks for the quick reply. I will check it out.

theblackcat102 changed discussion status to closed
