
Chunyuan Li

Chunyuan24

https://chunyuan.li/


Recent Activity

liked a Space about 2 months ago

Tonic/Llava-Video

commented a paper about 2 months ago

Video Instruction Tuning With Synthetic Data

authored a paper about 2 months ago

LLaVA-Critic: Learning to Evaluate Multimodal Models



Chunyuan24's activity

liked a Space about 2 months ago

Llava Video 🌋📹

Space • Running on Zero • Interact with videos!

commented a paper about 2 months ago

Video Instruction Tuning With Synthetic Data

Paper • 2410.02713 • Published Oct 3 • 37

authored 2 papers about 2 months ago

LLaVA-Critic: Learning to Evaluate Multimodal Models

Paper • 2410.02712 • Published Oct 3 • 34

Video Instruction Tuning With Synthetic Data

Paper • 2410.02713 • Published Oct 3 • 37

upvoted 2 papers about 2 months ago

Video Instruction Tuning With Synthetic Data

Paper • 2410.02713 • Published Oct 3 • 37

LLaVA-Critic: Learning to Evaluate Multimodal Models

Paper • 2410.02712 • Published Oct 3 • 34

authored a paper 2 months ago

MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines

Paper • 2409.12959 • Published Sep 19 • 36

authored a paper 3 months ago

SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners

Paper • 2408.16768 • Published Aug 29 • 26

authored a paper 4 months ago

LLaVA-OneVision: Easy Visual Task Transfer

Paper • 2408.03326 • Published Aug 6 • 59

authored 2 papers 5 months ago

LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models

Paper • 2407.07895 • Published Jul 10 • 40

Long Context Transfer from Language to Vision

Paper • 2406.16852 • Published Jun 24 • 32

authored a paper 6 months ago

MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

Paper • 2406.09411 • Published Jun 13 • 18

authored a paper 11 months ago

TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10 • 65

authored a paper 12 months ago

LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models

Paper • 2312.02949 • Published Dec 5, 2023 • 11

authored 6 papers about 1 year ago

Visual In-Context Prompting

Paper • 2311.13601 • Published Nov 22, 2023 • 16

LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 47

LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

Paper • 2311.00571 • Published Nov 1, 2023 • 40