There seems to be a problem with the Mixtral finetune evaluations
I have seen that some Mixtral finetunes have failed the evaluation.
This also includes our finetunes. Does anyone know why this could be?
Currently, almost all successfully evaluated Mixtral models have `sliding_window` set, or had it set at the time of evaluation, even though that setting is actually incorrect for Mixtral. Could the failures be related to that?
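For anyone who wants to check whether their finetune's config has `sliding_window` set before resubmitting, here is a minimal sketch. The helper name and the locally written config file are illustrative only, not part of the leaderboard tooling; `sliding_window` is a real key in Mixtral `config.json` files, and the reference Mixtral config sets it to `null`.

```python
import json

def has_sliding_window(config_path: str) -> bool:
    """Return True if the config sets sliding_window to a non-null value.

    The reference Mixtral config uses "sliding_window": null (full
    attention), so a non-null value may indicate a copied-over
    Mistral-7B setting.
    """
    with open(config_path) as f:
        cfg = json.load(f)
    return cfg.get("sliding_window") is not None

# Illustrative example: write a minimal Mixtral-style config locally.
with open("config.json", "w") as f:
    json.dump({"model_type": "mixtral", "sliding_window": 4096}, f)

print(has_sliding_window("config.json"))  # → True (flag is set)
```

To run this against a model on the Hub, you could download its `config.json` with `huggingface_hub.hf_hub_download` and pass the returned path to the helper.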
Regards,
David
@DavidGF I was wondering why Mixtrals weren't showing up on the leaderboard. Out of curiosity can you provide links to some that failed?
Hey Phil337,
Of course! Just to name a few MoE models:
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/c2e645c3e96d1c424e688314f9864c76090243ca
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/1fd2ef8cb1af226bbe607e86543bf53a7048cd43
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/Undi95/Mixtral-4x7B-DPO-RPChat_eval_request_False_bfloat16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/Undi95/Llamix2-MLewd-4x13B_eval_request_False_float16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/Undi95/Project-8x7B_eval_request_False_bfloat16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/VAGOsolutions/SauerkrautLM-Mixtral-8x7B_eval_request_False_bfloat16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/VAGOsolutions/SauerkrautLM-Mixtral-8x7B-Instruct_eval_request_False_bfloat16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/883bfa183acca66468b9d97d7f25b405575b8282#d2h-655087
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/62b9fb4ea3c8fb9e85aaf8fb7838bff5644461d0
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/26f6b1c43bece72a802aa5ed0499588d4879f1ae
@DavidGF Thanks! That list includes a couple I was waiting on, especially Dolphin.
Hi!
Thanks for the detailed report!
We changed the cluster on which the leaderboard backend is running, and it appears to have trouble connecting to the Hub. We are investigating and will fix it as soon as possible.
Hi! Some models failed while we migrated the leaderboard backend. I will requeue all of those models. Thanks for the notice :)
I'm sorry to reopen this topic, but there still seems to be a problem, especially with the instruct models:
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/VAGOsolutions/SauerkrautLM-Mixtral-8x7B-Instruct_eval_request_False_bfloat16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/8c76a8d6dcd32bda6c3ac06e238ab2b40b7c0c3f
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/c0bc6933b4ee2432b3d5169d0278fcd94471a42d
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/YeungNLP/firefly-mixtral-8x7b-v1_eval_request_False_float16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/66a0a6d0f525cd643f88e99a191c933dc40a5e91
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/dfurman/Mixtral-8x7B-peft-v0.1_eval_request_False_float16_Adapter.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/ehartford/dolphin-2.5-mixtral-8x7b_eval_request_False_bfloat16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/orangetin/OpenHermes-Mixtral-8x7B_eval_request_False_bfloat16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/perlthoughts/Mistral-7B-Instruct-v0.2-2x7B-MoE_eval_request_False_float16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/xDAN2099/xDAN-L1-moe-4x7b_eval_request_False_bfloat16_Original.json
Hi,
FYI, the new cluster is having serious connectivity problems. We are putting all evals on hold until it's fixed, and we'll relaunch all FAILED evals from the past two days.
OpenPipe/mistral-ft-optimized-1218
cognitivecomputations/dolphin-2.6-mixtral-8x7b
These models have issues getting into the evaluation queue as well.
Thanks for fixing!