Open-Arabic-LLM-Leaderboard


FAQ

by alielfilali01 - opened May 14

Discussion

alielfilali01

Open Arabic LLM Leaderboard org May 14

Please feel free to ask all your questions here

MohamedRashad

May 14

Leaderboard updates are a little slow.
I submitted a model and it still doesn't show up in the pending evaluations (and nothing else changes or moves either).

alielfilali01

Open Arabic LLM Leaderboard org May 14

@MohamedRashad
There are some heavy models currently being evaluated in parallel, and that is what is blocking the leaderboard. We expect to see more Finished models (more than 14) by tomorrow.
I checked, and it seems that all the models in the requests dataset appear in the PENDING toggle under the "Submit here" tab, so apologies, but I don't understand what you meant.

MohamedRashad

May 14

I have now found the model I submitted 😅

Everything is working great ^^

soufianechami

May 17

I know this might seem obvious to many users here, but some (myself included) still think the current leaderboard is the final evaluation.

Please make it clear to users that the ranking is not final—the evaluation is still ongoing.

Also, could you provide an estimated timeline for when the evaluation will be complete?

alielfilali01

Open Arabic LLM Leaderboard org May 18

Dear @soufianechami, leaderboards are by nature never in a final state: models arrive every day, get submitted, and are then evaluated in turn. To stay up to date, you will need to check the leaderboard every once in a while.

rahimnathwani

May 21

I'm curious whether there will be a section for embedding models?

Hugging Face has a leaderboard for embedding models (https://huggingface.co/spaces/mteb/leaderboard), but the scores and rankings are all based on English, Chinese, French and Polish.

It's hard to know which of the models may work well for Arabic, e.g. for building the retrieval part of a RAG system.

derek-thomas

Open Arabic LLM Leaderboard org May 27

@rahimnathwani you can find Arabic under STS -> Other

konstantindobler

Jun 1

Hi, thanks for compiling this resource!

Could you provide the exact lighteval command / config used for the evaluations? For example, in ./examples/tasks/OALL.txt from the official lighteval repo, (almost) all tasks are evaluated 5-shot with |5|1; on the leaderboard, however, everything is 0-shot.

Hamza-Alobeidli

Open Arabic LLM Leaderboard org Jun 4

Hello,
Yes, please just change all of them to |0|0.
This is our setting.
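For anyone who wants to apply that change in bulk, here is a minimal sketch that rewrites the trailing `|5|1` fields of each task line to `|0|0`. It assumes each task line ends with two integer fields separated by `|`, as in the `|5|1` suffix mentioned above; the sample task name is made up for illustration, and the helper names `to_zero_shot` and `convert_task_file` are hypothetical, not part of lighteval.

```python
def to_zero_shot(line: str) -> str:
    """Rewrite a task line ending in '|<int>|<int>' so that it ends in '|0|0'."""
    stripped = line.strip()
    # Leave blank lines and comment lines untouched.
    if not stripped or stripped.startswith("#"):
        return line
    parts = stripped.split("|")
    # Only touch lines whose last two fields are integers, e.g. '...|5|1'.
    if len(parts) < 3 or not (parts[-1].isdigit() and parts[-2].isdigit()):
        return line
    parts[-2:] = ["0", "0"]
    return "|".join(parts)


def convert_task_file(text: str) -> str:
    """Apply to_zero_shot to every line of a task file's contents."""
    return "\n".join(to_zero_shot(line) for line in text.splitlines())


# Hypothetical task line, for illustration only (not copied from OALL.txt).
sample = "# example tasks\ncommunity|example_task:subset|5|1"
print(convert_task_file(sample))
```

To use it on a real file, read the contents of your local copy of the task file, pass them through `convert_task_file`, and write the result to a new file that you then hand to lighteval.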

ahmadelsallab

Aug 6

Hi,

Which dataset source is used in the ACVA benchmark?
This one: https://huggingface.co/datasets/FreedomIntelligence/ACVA-Arabic-Cultural-Value-Alignment/viewer/default/validation

Or this one: https://huggingface.co/datasets/OALL/ACVA

ahmadelsallab

Aug 6

Also, for the AlGhafa benchmark, which dataset is used?

There are multiple datasets here: https://gitlab.com/tiiuae/alghafa/-/tree/main?ref_type=heads

Also, on the OALL/Datasets I can find:
https://huggingface.co/datasets/OALL/AlGhafa-Arabic-LLM-Benchmark-Translated

And:
https://huggingface.co/datasets/OALL/AlGhafa-Arabic-LLM-Benchmark-Native

So which one is used? And how is the final metric calculated over the benchmark datasets?

ManojShack

Sep 12 • edited Sep 13

Hi @alielfilali01 ,
I submitted the fine-tuned adapter airev-ai/Amal-70b-v2.3.2 (base model: airev-ai/Amal-70b-v2) with bfloat16 precision a couple of hours ago, but its status shows as failed in the requests card. Both the adapter and the base model are public, and I have added a model card and attached a valid license. I am unsure why the model submission failed. Any assistance in this matter would be greatly appreciated.

Thank you.

amztheory

Open Arabic LLM Leaderboard org Sep 18

Hi @ManojShack
Thanks for submitting your model to the leaderboard.
Regarding your concern: when we attempted to evaluate your models, we ran into errors indicating that they are missing config.json. Please make sure the config is included and submit again.
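
A quick pre-submission check along these lines may help catch this earlier. This is only a sketch: it assumes `config.json` is the one file that matters (it is the only one named above), and the helper `missing_required_files` and the example file listing are hypothetical.

```python
from typing import Iterable, List

# Assumption: config.json is the only required file, since it is the one named above.
REQUIRED_FILES = ("config.json",)


def missing_required_files(
    repo_files: Iterable[str], required: Iterable[str] = REQUIRED_FILES
) -> List[str]:
    """Return the required file names that are absent from a repo's file listing."""
    present = set(repo_files)
    return [name for name in required if name not in present]


# Made-up file listing for illustration; not the contents of any real repo.
example_files = ["README.md", "adapter_config.json", "adapter_model.safetensors"]
print(missing_required_files(example_files))  # ['config.json']
```

The file listing for a real repo can be obtained with `huggingface_hub.list_repo_files(repo_id)`, which returns the file paths in the repo; an empty result from `missing_required_files` means the check passed.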
