e5-R-mistral-7b for retrieval: request to refresh the results
Hi @tomaarsen @Muennighoff,
We submitted a new model, BeastyZ/e5-R-mistral-7b. Could you please refresh the space?
Thanks!
Beasty
Seems like it already shows up, likely via the automatic refresh. Curious why the result parsing did not work, though. Was it because of the `(default)` suffix? Did that get added by the script in the MTEB repo?
Yes, `(default)` was added by the script in the MTEB repo. I have now manually deleted `(default)` and am waiting for the next automatic refresh.
Oh, that seems like a bug, as it should work out of the box. I'll need to double check. cc @KennethEnevoldsen in case you know; this seems related to changes in the meta script.
@BeastyZ did you create the metadata using the CLI `mteb create_meta ...`? If so it should work (otherwise we have a bug to fix).
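For reference, a typical invocation looks something like this; the paths are placeholders and the exact flags may differ across versions, so check `mteb create_meta --help`:

```bash
# Regenerate the model-card metadata from a local results folder.
# Placeholder paths; point --results_folder at your own model's results.
mteb create_meta --results_folder results/BeastyZ__e5-R-mistral-7b --output_path model_card.md
```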
Hmm right, looking at the code it also seems like it is an error on our end. @Muennighoff we should probably allow for "(default)" for consistency with the other subsets. WDYT?
> we should probably allow for "(default)" for consistency with the other subsets.
It seems like we can either:

(a) Change the leaderboard code to allow `default`. The problem here is that we do not want `(default)` to appear in the leaderboard table, I think, as it is not very useful, but we do want languages to appear. So we would have to manually replace it somewhere in the code. This probably adds a line or two to the LB code here: https://github.com/embeddings-benchmark/leaderboard/blob/bef8d2ff6b420db179018d2a2689207aad180449/refresh.py#L325. The question is whether we want it to appear in the Evaluation results sidebar of models. It also seems not very useful there, so maybe there is no need, but then this solution would not be desirable.
(b) Change the mteb code not to add `default` to the name like here. This adds one line here: https://github.com/embeddings-benchmark/mteb/blob/778d7a3bf85b2023cc8ba9b2c35a810dcfa5e924/mteb/cli.py#L298. This is how it has worked thus far.
I don't have a strong preference, but given that `default` is not very useful info in the sidebar/metadata (note that it is still recorded in the config field, just not shown in the name) and it's how it has worked thus far, I'd go with (b). But happy to be disagreed with! :)
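To make the two options concrete, here is a minimal, hypothetical sketch; the function names and structure are illustrative, not the actual code in leaderboard/refresh.py or mteb/cli.py:

```python
# Hypothetical sketch of both options; names are illustrative only.

# (a) Leaderboard side: strip the trivial subset when rendering names, so
#     "Task (default)" displays as "Task" while "Task (fr)" stays intact.
def display_name(raw_name: str) -> str:
    return raw_name.removesuffix(" (default)")

# (b) mteb side: only append a subset to the task name when it is a real
#     language/subset, never the lone "default" subset, so it is not written
#     into the model-card metadata in the first place.
def result_name(task_name: str, subset: str) -> str:
    return task_name if subset == "default" else f"{task_name} ({subset})"

assert display_name("PawsXPairClassification (default)") == "PawsXPairClassification"
assert result_name("PawsXPairClassification", "fr") == "PawsXPairClassification (fr)"
```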
I manually deleted the `(default)` suffix 24 hours ago, but my model, e5-R-mistral-7b, still hasn't appeared on the retrieval leaderboard. Why is that?
The latest refresh failed: https://github.com/embeddings-benchmark/leaderboard/actions/runs/9884681390
Apologies while we work out the kinks of the new automatic refresh.
cc @KennethEnevoldsen @orionweller: this is regarding the `PawsXPairClassification (fr)` key not being found.
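For illustration, the failure is a hard lookup on a missing key; a guard along these lines (hypothetical names, not the actual refresh.py fix) would skip the entry instead of aborting the whole refresh:

```python
# Hypothetical guard; the dict and key names are assumptions, not the real
# refresh.py structure. Placeholder data, for demonstration only.
task_scores: dict[str, float] = {"PawsXPairClassification (en)": 0.0}
score = task_scores.get("PawsXPairClassification (fr)")
if score is None:
    print("warning: missing key 'PawsXPairClassification (fr)'; skipping")
```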
- Tom Aarsen
Yes, sorry about this @BeastyZ! Pushing a fix now.
Making an issue on the leaderboard GitHub to consolidate this: https://github.com/embeddings-benchmark/leaderboard/issues/8
@KennethEnevoldsen @Muennighoff @orionweller @tomaarsen
Thank you for your timely and kind help! Things are moving in a positive direction. I only want to add my model to the retrieval leaderboard. Many scores appear, but `Average` and `CQADupstackRetrieval` are not among them.
Hey @BeastyZ! In the GitHub issue I referenced earlier, I pointed this out and tagged you there (or so I thought; perhaps I got the wrong GitHub handle). I agree it's an issue!
Is it okay if we move the discussion there? We're trying to move away from using the Spaces for PRs/discussion.
FWIW @BeastyZ, the issue is that you don't seem to have a `main_score` for MTEB `CQADupstackRetrieval`. I think you need to aggregate the individual CQADupstack sub-task scores.
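As a rough sketch of that aggregation: on the leaderboard, `CQADupstackRetrieval` is conventionally the mean of the twelve CQADupstack sub-task main scores (`ndcg_at_10` for retrieval). The task list matches the standard MTEB suite, but the dict layout and helper name below are my own assumptions, not the exact leaderboard code:

```python
# Sketch: aggregate the twelve CQADupstack sub-task main scores
# (ndcg_at_10) into a single "CQADupstackRetrieval" number.
CQA_SUBTASKS = [
    "CQADupstackAndroidRetrieval", "CQADupstackEnglishRetrieval",
    "CQADupstackGamingRetrieval", "CQADupstackGisRetrieval",
    "CQADupstackMathematicaRetrieval", "CQADupstackPhysicsRetrieval",
    "CQADupstackProgrammersRetrieval", "CQADupstackStatsRetrieval",
    "CQADupstackTexRetrieval", "CQADupstackUnixRetrieval",
    "CQADupstackWebmastersRetrieval", "CQADupstackWordpressRetrieval",
]

def cqadupstack_main_score(main_scores: dict[str, float]) -> float:
    """Mean of the sub-task main scores; raises KeyError if any are missing."""
    return sum(main_scores[task] for task in CQA_SUBTASKS) / len(CQA_SUBTASKS)
```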
Will close this issue again and refer to https://github.com/embeddings-benchmark/leaderboard/issues/8