Submitted by akhaliq 17 InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding · 18 authors 1
Submitted by akhaliq 14 Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance · 8 authors 2
Submitted by akhaliq 12 ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars · 5 authors 1
Submitted by akhaliq 11 SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series · 2 authors 1
Submitted by akhaliq 9 DragAPart: Learning a Part-Level Motion Prior for Articulated Objects · 4 authors 1
Submitted by akhaliq 8 FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions · 8 authors 1
Submitted by akhaliq 6 AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models · 15 authors 2