Validation set results
Hi,
Are the results available just on test sets? How can I access validation set results, if any?
Many thanks!
Hi!
We run evaluation on the validation set, the test set, or both, depending on how each task is configured in the Eleuther AI Harness (the task table is here).
You can access the details of each model by clicking on the page icon after their name :)
I'm seeking access to the results for all the test set examples for all the models.
Is there a designated location where the computed results are cached?
Basically, I'm asking whether there is a backup of the files saved by this line: https://github.com/EleutherAI/lm-evaluation-harness/blob/master/lm_eval/evaluator.py#L397
All the results are accessible in the details datasets. For example, here are the files for the Yi-34B model: https://huggingface.co/datasets/open-llm-leaderboard/details_01-ai__Yi-34B/tree/main/2023-11-08T19-46-38.378007
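If you want to fetch these files from a script rather than the web UI, something like the sketch below might work. It assumes the details repos all follow the naming pattern visible in the Yi-34B link (`details_<org>__<model>`), which is inferred from that one URL and not confirmed for every model. The helper names are made up for illustration; `list_repo_files` is a `huggingface_hub` function.

```python
def details_repo_id(model_id: str) -> str:
    """Map a model ID such as '01-ai/Yi-34B' to its details dataset repo."""
    # Assumption: naming pattern inferred from the single Yi-34B URL above.
    return "open-llm-leaderboard/details_" + model_id.replace("/", "__")


def files_for_run(all_files: list[str], run_timestamp: str) -> list[str]:
    """Keep only the files stored under one run folder."""
    prefix = run_timestamp.rstrip("/") + "/"
    return sorted(f for f in all_files if f.startswith(prefix))


def list_run_files(model_id: str, run_timestamp: str) -> list[str]:
    """List the files of one evaluation run (needs network access)."""
    # Imported lazily so the two helpers above work without huggingface_hub.
    from huggingface_hub import list_repo_files

    files = list_repo_files(details_repo_id(model_id), repo_type="dataset")
    return files_for_run(files, run_timestamp)
```

For the example above you would call `list_run_files("01-ai/Yi-34B", "2023-11-08T19-46-38.378007")` and then download whichever files you need, for instance with `hf_hub_download(..., repo_type="dataset")`.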