[FLAG] Garrulus and Turdus based models

#548
by MichaelKarpe - opened

Based on https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/526:

  • https://huggingface.co/udkai/Turdus: "A less contaminated version" is still a contaminated version (thanks to the author for acknowledging it), even if "all 5-non Winograde metrics [...] to be 0.2% higher than the underlying model."

At the time of writing, all 7B models with better average score than https://huggingface.co/mlabonne/NeuralBeagle14-7B appear to be contaminated (thanks to the authors for their transparency regarding contamination):

Also a few with lower average score:

MichaelKarpe changed discussion title from [FLAG] udkai/Turdus to [FLAG] Turdus-based models
MichaelKarpe changed discussion title from [FLAG] Turdus-based models to [FLAG] Garrulus and Turdus based models

As I explained here: https://www.reddit.com/r/LocalLLaMA/comments/19acvq2/huge_issue_with_truthfulqa_contamination_and/ Turdus, among other models, is contaminated not only through its fine-tuning but also through its lineage. There is also a license issue that I explained (UPDATE: it looks like Turdus fixed their license).
I request the assistance of volunteers, as the leaderboard maintainer is not willing to take action and asked me to flag all the models, which I can't do because of rate limits (1 post/comment per 24 hours as a new user).
Or maybe we can escalate this issue to other HF staff? This is getting out of hand.
I reported Turdus for contamination 2 days ago and for the license issue 1 day ago, but no action has been taken.
Also, I have a question: how come HuggingFaceH4/zephyr-7b-beta is under the MIT license when its parent model (Mistral) is under Apache-2.0?

EDIT: It seems the rate limit has finally been lifted or increased, so I can post more.
What's the proper way: report models on their individual pages, or make a single post to track them from one location?

Open LLM Leaderboard org

Hi, thanks @MichaelKarpe for the detailed report; I flagged the models you cited.
@ifjeakeiq The best way to report models for flagging is to open a discussion on the leaderboard with the name [FLAG] model_name and add a description explaining why the model should be flagged. You can group multiple models in one discussion, just like @MichaelKarpe did. Thanks for your help! :)
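
For anyone who would rather script this than use the web form, here is a rough sketch of the same steps. It assumes the `huggingface_hub` package is installed and that you have a token with write access. The helper names `build_flag_discussion` and `open_flag_discussion` are made up for illustration, the Space id is taken from the URL quoted at the top of this thread, and the description layout is only a suggestion, not an official template.

```python
def build_flag_discussion(model_ids, reason):
    """Return (title, description) for a [FLAG] discussion.

    Hypothetical helper: the layout of the description is an assumption,
    not a format required by the leaderboard.
    """
    if not model_ids:
        raise ValueError("at least one model id is required")
    # Several models can be grouped in one discussion.
    title = "[FLAG] " + ", ".join(model_ids)
    lines = ["Models to flag:", ""]
    lines += [f"- https://huggingface.co/{m}" for m in model_ids]
    lines += ["", "Reason:", "", reason.strip()]
    return title, "\n".join(lines)


def open_flag_discussion(model_ids, reason, token=None):
    """Open the discussion on the leaderboard Space (needs network access)."""
    # Imported lazily so build_flag_discussion works without the package.
    from huggingface_hub import HfApi

    title, description = build_flag_discussion(model_ids, reason)
    return HfApi().create_discussion(
        repo_id="HuggingFaceH4/open_llm_leaderboard",  # from the URL above
        repo_type="space",
        title=title,
        description=description,
        token=token,
    )
```

Calling `build_flag_discussion(["udkai/Turdus"], "Contaminated; see the linked discussion.")` lets you preview the title and body before anything is posted.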

Open LLM Leaderboard org

Since @SaylorTwift flagged all the relevant models, I'm closing this issue.
Thanks a lot, @MichaelKarpe!

clefourrier changed discussion status to closed
