It seems that the results of some recent evaluation tasks have not been uploaded

#394
by Azure99 - opened

Hi, @clefourrier
It seems that some recent evaluation tasks are marked as completed, but their results have not been uploaded.
I observed that the latest update of https://huggingface.co/datasets/open-llm-leaderboard/results/tree/main was 15 hours ago. However, the running status in https://huggingface.co/datasets/open-llm-leaderboard/requests/tree/main is still being updated.

If it’s not too much trouble, could you take a look?
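For anyone who wants to reproduce the check described above, here is a minimal sketch, assuming the third-party `huggingface_hub` package is installed and the two dataset repos are publicly readable. The helper names `results_lag` and `latest_commit_time` are made up for illustration and are not part of the leaderboard's tooling.

```python
from datetime import datetime, timedelta


def results_lag(last_results_commit: datetime, last_requests_commit: datetime) -> timedelta:
    """Return how far the results repo trails the requests repo (never negative)."""
    return max(last_requests_commit - last_results_commit, timedelta(0))


def latest_commit_time(repo_id: str) -> datetime:
    """Fetch the newest commit time of a dataset repo (needs network access)."""
    # Imported lazily so the pure helper above works without the dependency.
    from huggingface_hub import HfApi

    commits = HfApi().list_repo_commits(repo_id, repo_type="dataset")
    return max(commit.created_at for commit in commits)


if __name__ == "__main__":
    lag = results_lag(
        latest_commit_time("open-llm-leaderboard/results"),
        latest_commit_time("open-llm-leaderboard/requests"),
    )
    print(f"results trail requests by {lag}")
```

A large lag only suggests that uploads may be stuck; it does not prove it, since there may simply be no finished evaluations to push.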

Open LLM Leaderboard org
edited Nov 21, 2023

Hi :)
Of course! Do you have a specific model you want me to take a look at?
Also cc: @SaylorTwift

@clefourrier
Thank you for your reply, the model I want to check is Azure99/blossom-v3-mistral-7b.
More generally, almost all recent tasks are marked as completed, but their results have not been uploaded. This may be an unexpected error.

Open LLM Leaderboard org

Hi!
Thank you very much! Yes, it would seem from the logs that we have a problem when pushing the results! We'll try to fix it asap!

Open LLM Leaderboard org

(But I can confirm your model was evaluated properly)

This is really great news, thank you again for your work.

Open LLM Leaderboard org

I pushed the results of all currently evaluated models manually, while we're working on a fix

Hi @clefourrier, sorry to bother you again.
Could you manually push the evaluation results of the models once more? Many models have been evaluated in the past day, particularly Microsoft's Orca-2, and I think many people would be interested in that.

Open LLM Leaderboard org

Hi! I just did :)
Are you sure Microsoft's model has been evaluated, though? I'm not seeing it in the results; I suspect it has not finished running yet.

@clefourrier Thank you for your prompt response.
I apologize, it seems I made an error – the Orca model is still under evaluation. Looking forward to the leaderboard being corrected appropriately.

Open LLM Leaderboard org

(Tagging @SaylorTwift re: results not being uploaded, btw)

Open LLM Leaderboard org

Hi! The issue has been fixed, sorry for the confusion. You should see results being uploaded again. Don't hesitate to re-open if you have any issue :)

SaylorTwift changed discussion status to closed

@clefourrier @SaylorTwift It seems that the same problem has occurred again. The model has been evaluated, but the results have not been uploaded.
https://huggingface.co/datasets/open-llm-leaderboard/results/commits/main

Azure99 changed discussion status to open

Open LLM Leaderboard org

Hi!
Thank you for reporting! Just did a manual upload of the results.

Azure99 changed discussion status to closed

Yes, it's here again. It seems that a batch of tasks has failed recently. Can you take another look?
Azure99/blossom-v3_1-yi-34b
@clefourrier @SaylorTwift
Thank you again.

Azure99 changed discussion status to open

Open LLM Leaderboard org

Hi @Azure99, can you point us to the request file please?

Hi @clefourrier, it's here.
/Azure99/blossom-v3_1-yi-34b_eval_request_False_bfloat16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/32cdbd36e7352b1bbed8fcc55dc5ea8b7b81382b
By the way, the recent evaluations of several 34B models have all failed, while evaluations of smaller models, such as 7B ones, complete normally.

Open LLM Leaderboard org

Hi! There was a connectivity issue when trying to read your model - I added it back to pending.

clefourrier changed discussion status to closed

Hi @clefourrier, just like before, another task unexpectedly failed. Could you help restart it? Thank you very much.
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/2a2c89c16442b63af92aed7a68eea23d3073c7ee

Azure99 changed discussion status to open

Open LLM Leaderboard org
edited Jan 4

Hi @Azure99, the model failed to load. Can you upload it in safetensors format, as required on the Submit page, please? :)
Once it's uploaded, I'll relaunch it.

Open LLM Leaderboard org

(Two points as a side note, to simplify our work next time: please point to the request file directly, not to the commit of the request file, and please open a new discussion instead of reopening old ones, so that we can sort them by order of priority more easily. That would really simplify things for us :) )
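As a rough illustration of linking to a request file directly, the sketch below builds a file URL from a model ID. The file-name pattern is inferred only from the single path quoted earlier in this thread (`<model>_eval_request_False_bfloat16_Original.json`), so treat it as an assumption; the function name `request_file_url` and its parameters are hypothetical, and the `/blob/main/` path is the usual Hub layout for viewing a file.

```python
from urllib.parse import quote

REQUESTS_REPO = "open-llm-leaderboard/requests"


def request_file_url(model_id, precision="bfloat16", weight_type="Original",
                     private=False, revision="main"):
    """Build a direct link to a model's request file (naming pattern is assumed)."""
    # Pattern inferred from one example in this thread, not from official docs.
    filename = f"{model_id}_eval_request_{private}_{precision}_{weight_type}.json"
    # quote() keeps "/" intact, so the "<user>/<model>" folder structure survives.
    return f"https://huggingface.co/datasets/{REQUESTS_REPO}/blob/{revision}/{quote(filename)}"


if __name__ == "__main__":
    print(request_file_url("Azure99/blossom-v3_1-yi-34b"))
```

If the generated link does not resolve, browsing the requests repo for the model's folder is the safer way to find the exact file.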

Open LLM Leaderboard org

Closing for inactivity, feel free to reopen if needed

clefourrier changed discussion status to closed
