Spaces: mteb/leaderboard

Restart the space for new models

#28

by infgrad - opened Sep 10, 2023

Discussion

infgrad

Sep 10, 2023

Hi, @Muennighoff
Thanks for the great work!
I submitted two new Chinese text embedding models, "stella-base-zh" and "stella-large-zh". Can you help restart this space?

Thanks!

Muennighoff

Massive Text Embedding Benchmark org Sep 11, 2023

Done! Congrats on the strong performance! cc @Jinkin

bayang

Sep 25, 2023

Hi @Muennighoff, thanks for taking the time to restart the leaderboard cache. Could you help refresh the leaderboard again this time?

I have another query:

How can I also get a German leaderboard for MTEB, like the ones for the CH and PO languages that you have in the GitHub repo?

Muennighoff

Massive Text Embedding Benchmark org Sep 25, 2023

Done!

We can add a German leaderboard, but there aren't many German datasets at this point, so I would wait for more first.

bayang

Sep 25, 2023

Okay, understood. I will try to generate a German dataset using translation.

If we consider knowledge distillation techniques for text representations in other languages, do you think it's worth it for the semantic search task? @Muennighoff
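
For readers unfamiliar with the idea, here is a minimal sketch of one common multilingual distillation objective. The thread does not say which distillation setup is meant, so this is an assumption: a student model is trained so that its embedding of a sentence and of that sentence's translation both match a teacher's embedding of the original sentence, using mean squared error. The function names and all vectors below are hypothetical toy values, not output from any real model.

```python
# Toy sketch of a multilingual distillation objective. This setup is an
# assumption for illustration; the thread does not specify a method.


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    assert len(a) == len(b)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(teacher_src, student_src, student_tgt):
    """Pull the student's embeddings of a sentence and of its translation
    toward the teacher's embedding of the original sentence."""
    return mse(teacher_src, student_src) + mse(teacher_src, student_tgt)


# Hypothetical embeddings for one (English, German) sentence pair.
teacher_en = [0.2, -0.1, 0.7]  # teacher embedding of the English sentence
student_en = [0.2, -0.1, 0.7]  # student already matches on English
student_de = [0.1, 0.1, 0.4]   # student is still off on the German translation

loss = distillation_loss(teacher_en, student_en, student_de)
print(f"distillation loss: {loss:.4f}")
```

In a real training loop the loss would be averaged over a batch of parallel sentences and minimized with respect to the student's parameters; only the loss computation is shown here.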

Muennighoff

Massive Text Embedding Benchmark org Sep 26, 2023

If you do human translation it's fine; a high-quality machine translation may be OK, too (BEIR-PL is machine-translated).

By semantic search do you mean Retrieval?

bayang

Sep 26, 2023

Yes, exactly for the retrieval process.

Then, to re-rank the results for the user query, I use a cross-encoder, which works very well.
The default pre-trained model sometimes works well with German text, but with some complicated sentences it performs badly.

That is why I thought to use knowledge distillation.
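
The retrieve-then-re-rank pipeline described above can be sketched roughly as follows. This is a toy illustration with no real models: a bag-of-words cosine similarity stands in for the embedding-based retriever, and a hypothetical `toy_pair_score` function (bigram overlap) stands in for the cross-encoder. The point is only the two-stage shape: a cheap first stage that scores documents independently, then a second stage that scores each (query, document) pair jointly.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector; a stand-in for an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k):
    """Stage 1: rank all documents by vector similarity, keep the top k."""
    q = bow(query)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]


def bigrams(text):
    tokens = text.lower().split()
    return set(zip(tokens, tokens[1:]))


def toy_pair_score(query, doc):
    """Stage 2 stand-in for a cross-encoder (hypothetical): looks at the
    query and document together, here via shared word bigrams."""
    q = bigrams(query)
    return len(q & bigrams(doc)) / len(q) if q else 0.0


def rerank(query, candidates, score_fn):
    """Re-order the stage 1 candidates by the pairwise score."""
    return sorted(candidates, key=lambda d: score_fn(query, d), reverse=True)


docs = [
    "der hund jagt den mann",
    "der mann jagt den hund",
    "die katze schläft auf dem sofa",
]
query = "mann jagt hund"

candidates = retrieve(query, docs, k=2)
reranked = rerank(query, candidates, toy_pair_score)
print(candidates)
print(reranked)
```

The two sentences about the dog and the man contain the same words, so the bag-of-words stage cannot tell them apart; the pairwise stage is sensitive to word order and moves the better match to the top.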

Muennighoff

Massive Text Embedding Benchmark org Sep 28, 2023

I see. Not sure, maybe it could work.
