Benchmarks - a hppdqdq Collection

hppdqdq's Collections

Benchmarks

updated 26 days ago

🥇 MMLU Pro
More advanced and challenging multi-task evaluation
Running on CPU Upgrade, 162 likes

🎭 Stick To Your Role! Leaderboard
Running, 28 likes

📊 ZeroEval Leaderboard
Running, 44 likes

🥇 Decentralized Arena Leaderboard
Running, 22 likes

🥇 Open Medical-LLM Leaderboard
Running on CPU Upgrade, 293 likes

🏆 GPU Poor LLM Arena
Compact LLM Battle Arena: Frugal AI Face-Off!
Running, 108 likes

🌎 Open VLM Video Leaderboard
VLMEvalKit Eval Results in video understanding benchmark
Running, 84 likes

🏆 Open LLM Leaderboard 2
Track, rank and evaluate open LLMs and chatbots
Running on CPU Upgrade, 11.8k likes
