LLaVA-o1: Let Vision Language Models Reason Step-by-Step • Paper • 2411.10440 • Published 9 days ago • 94
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction? • Paper • 2411.06469 • Published 14 days ago • 17
Sharingan: Extract User Action Sequence from Desktop Recordings • Paper • 2411.08768 • Published 11 days ago • 9
AnimateAnything: Consistent and Controllable Animation for Video Generation • Paper • 2411.10836 • Published 8 days ago • 18
SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory • Paper • 2411.11922 • Published 6 days ago • 13