---
title: JudgerBench Leaderboard
emoji: 🌎
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.1.0
app_file: app.py
pinned: true
license: apache-2.0
tags:
- leaderboard
short_description: 'JudgerBench Leaderboard'
---
This space displays all evaluation results obtained with VLMEvalKit. It provides an overall leaderboard, consisting of a curated selection of benchmarks and the overall score, as well as benchmark-level leaderboards that provide the overall and fine-grained scores for each individual benchmark.
GitHub: https://github.com/open-compass/VLMEvalKit
Report: https://arxiv.org/abs/2407.11691
Please consider citing the report if this resource is useful to your research:
```bibtex
@misc{duan2024vlmevalkitopensourcetoolkitevaluating,
title={VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models},
author={Haodong Duan and Junming Yang and Yuxuan Qiao and Xinyu Fang and Lin Chen and Yuan Liu and Amit Agarwal and Zhe Chen and Mo Li and Yubo Ma and Hailong Sun and Xiangyu Zhao and Junbo Cui and Xiaoyi Dong and Yuhang Zang and Pan Zhang and Jiaqi Wang and Dahua Lin and Kai Chen},
year={2024},
eprint={2407.11691},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2407.11691},
}
```