kennymckormick committed
Commit 9c85353 · 1 Parent(s): 4ceef72
update BLINK
meta_data.py CHANGED (+7 -0)
@@ -217,3 +217,10 @@ LEADERBOARD_MD['SEEDBench2'] = """
 - SEEDBench2 comprises 24K multiple-choice questions with accurate human annotations, which spans 27 dimensions, including the evaluation of both text and image generation.
 - Note that we only evaluate and report the part of model's results on the SEEDBench2.
 """
+
+LEADERBOARD_MD['BLINK'] = """
+## BLINK Test Evaluation Results
+
+- BLINK is a benchmark containing 14 visual perception tasks that can be solved by humans “within a blink”, but pose significant challenges for current multimodal large language models (LLMs).
+- We evaluate BLINK on the test set of the benchmark, which contains 1901 visual questions in multi-choice format.
+"""
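As a rough illustration of how an entry like the one added in this hunk might be consumed, the sketch below registers a BLINK description in a dictionary and looks it up by dataset name. Only the `LEADERBOARD_MD` name and the `'BLINK'` key come from the diff; the empty starting dictionary, the `get_description` helper, and the fallback message are hypothetical and are not taken from the repository.

```python
# Hypothetical stand-in for the dictionary defined in meta_data.py.
LEADERBOARD_MD = {}

# Plain assignment registers the markdown description under the 'BLINK' key.
LEADERBOARD_MD['BLINK'] = """
## BLINK Test Evaluation Results

- BLINK is a benchmark containing 14 visual perception tasks.
- We evaluate BLINK on the test set, which contains 1901 visual questions.
"""


def get_description(dataset: str) -> str:
    """Return the stripped markdown text for a dataset (hypothetical helper).

    Falls back to a short note when no entry has been registered.
    """
    return LEADERBOARD_MD.get(dataset, f"No description for {dataset}.").strip()


if __name__ == "__main__":
    print(get_description("BLINK"))
    print(get_description("UnknownBench"))
```

The lookup uses `dict.get` with a default so that a dataset without a registered description yields a readable note instead of raising `KeyError`.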