Vision+LLM - a Chaolei Collection

Chaolei's Collections

Vision+LLM

updated Nov 12, 2023

LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 47

On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving

Paper • 2311.05332 • Published Nov 9, 2023 • 9

SoundCam: A Dataset for Finding Humans Using Room Acoustics

Paper • 2311.03517 • Published Nov 6, 2023 • 10