zfj1998 committed
Commit 54427b3 • 1 Parent(s): df168cd

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -8,7 +8,7 @@ pinned: false
 ---
 ### HumanEval-V: A Lightweight Visual Understanding and Reasoning Benchmark for Evaluating LMMs through Coding Tasks
 
-<p align="center"> <a href="">📄 Paper</a> • <a href="https://humaneval-v.github.io">🏠 Home Page</a> • <a href="https://github.com/HumanEval-V/HumanEval-V-Benchmark">💻 GitHub Repository</a> • <a href="https://humaneval-v.github.io/#leaderboard">🏆 Leaderboard</a> • <a href="https://huggingface.co/datasets/HumanEval-V/HumanEval-V-Benchmark">🤗 Dataset</a> • <a href="https://huggingface.co/spaces/HumanEval-V/HumanEval-V-Benchmark-Viewer">🤗 Dataset Viewer</a> </p>
+<p align="center"> <a href="https://arxiv.org/abs/2410.12381">📄 Paper</a> • <a href="https://humaneval-v.github.io">🏠 Home Page</a> • <a href="https://github.com/HumanEval-V/HumanEval-V-Benchmark">💻 GitHub Repository</a> • <a href="https://humaneval-v.github.io/#leaderboard">🏆 Leaderboard</a> • <a href="https://huggingface.co/datasets/HumanEval-V/HumanEval-V-Benchmark">🤗 Dataset</a> • <a href="https://huggingface.co/spaces/HumanEval-V/HumanEval-V-Benchmark-Viewer">🤗 Dataset Viewer</a> </p>
 
 **HumanEval-V** is a novel and lightweight benchmark designed to evaluate the visual understanding and reasoning capabilities of Large Multimodal Models (LMMs) through coding tasks. The dataset comprises **108 entry-level Python programming challenges**, adapted from platforms like CodeForces and Stack Overflow. Each task includes **visual context that is indispensable to the problem**, requiring models to perceive, reason, and generate Python code solutions accordingly.
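As an illustrative aside (not part of this commit), a minimal sketch of loading the linked dataset from the Hugging Face Hub with the `datasets` library; the splits and column names printed are whatever the dataset actually provides and are not assumed here:

```python
# Minimal sketch: load the HumanEval-V benchmark from the Hugging Face Hub.
# Assumes the `datasets` library is installed and network access is available.
from datasets import load_dataset

ds = load_dataset("HumanEval-V/HumanEval-V-Benchmark")
print(ds)  # prints the available splits, task counts, and column names
```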