---
title: JudgerBench Leaderboard
emoji: 🌎
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.1.0
app_file: app.py
pinned: true
license: apache-2.0
tags:
- leaderboard
short_description: 'JudgerBench Leaderboard'
---
This space displays all evaluation results obtained with VLMEvalKit. It provides an overall leaderboard, consisting of a curated selection of benchmarks and the overall score, as well as benchmark-level leaderboards that provide the overall and fine-grained scores for each individual benchmark.
GitHub: https://github.com/open-compass/VLMEvalKit
Report: https://arxiv.org/abs/2407.11691
Please consider citing the report if this resource is useful to your research:
```bibtex
@misc{duan2024vlmevalkitopensourcetoolkitevaluating,
title={VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models},
author={Haodong Duan and Junming Yang and Yuxuan Qiao and Xinyu Fang and Lin Chen and Yuan Liu and Amit Agarwal and Zhe Chen and Mo Li and Yubo Ma and Hailong Sun and Xiangyu Zhao and Junbo Cui and Xiaoyi Dong and Yuhang Zang and Pan Zhang and Jiaqi Wang and Dahua Lin and Kai Chen},
year={2024},
eprint={2407.11691},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2407.11691},
}
```