arxiv:2411.17991
Yuxuan Wang
ColorfulAI
AI & ML interests
Multimodal Learning
Recent Activity
commented on a paper 15 days ago
VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
updated a model about 2 months ago
ColorfulAI/NeedleInAVideoHaystack