When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training Paper • 2411.13476 • Published 1 day ago • 4
SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory Paper • 2411.11922 • Published 4 days ago • 12
VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation Paper • 2411.13281 • Published 1 day ago • 15
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models Paper • 2411.13503 • Published 1 day ago • 22
SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration Paper • 2411.10958 • Published 5 days ago • 32
AnimateAnything: Consistent and Controllable Animation for Video Generation Paper • 2411.10836 • Published 5 days ago • 18
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published 6 days ago • 87
Drowning in Documents: Consequences of Scaling Reranker Inference Paper • 2411.11767 • Published 3 days ago • 16
Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering Paper • 2411.09213 • Published 8 days ago • 6
SlimLM: An Efficient Small Language Model for On-Device Document Assistance Paper • 2411.09944 • Published 7 days ago • 12
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices Paper • 2411.10640 • Published 6 days ago • 37
Sharingan: Extract User Action Sequence from Desktop Recordings Paper • 2411.08768 • Published 8 days ago • 9
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction? Paper • 2411.06469 • Published 11 days ago • 17
MagicQuill: An Intelligent Interactive Image Editing System Paper • 2411.09703 • Published 7 days ago • 50
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models Paper • 2411.09595 • Published 7 days ago • 65
Large Language Models Can Self-Improve in Long-context Reasoning Paper • 2411.08147 • Published 9 days ago • 58
Releasing the largest multilingual open pretraining dataset Article • By Pclanglais • 8 days ago • 94