multiple_choice_score: there are 12032 tasks in prompt
multiple_choice_score: reading tasks
multiple_choice_score: failed to read task 1 of 12032