HAODONG DUAN

KennyUTC

https://kennymckormick.github.io

AI & ML interests

Video Understanding; Multi-Modal Learning

Recent Activity

updated a Space 6 days ago

opencompass/open_vlm_leaderboard

updated a dataset 17 days ago

VLMEval/OpenVLMRecords

authored a paper about 1 month ago

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Articles

Claude-3.5 Evaluation Results on Open VLM Leaderboard

Jun 24

• 5

RealWorldQA, What's New?

Apr 25

• 6

Organizations

Posts 2

Post

1449

OPEN VLM LEADERBOARD JUST RELEASED the FULL EVALUATION RESULTS of GPT-4o

[TL;DR]
GPT-4o shows steady progress over GPT-4v (0419), gaining about 3.4 points on the average score (68.7% -> 72.1%). GPT-4o shows stronger perception and hallucinates less.

opencompass/open_vlm_leaderboard

Post

2564

Open VLM Leaderboard has just added results for GPT-4v (20240409); the new proprietary model ranks 1st among 50+ VLMs. Compared to the previous version (20231106), the improvements in both multimodal perception and reasoning are huge.

Check the results:
opencompass/open_vlm_leaderboard

Papers 20

Spaces 1

Running

🌖

BotChat

Models

None public yet

Datasets

None public yet