All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages • Paper • 2411.16508 • Published 3 days ago • 7
GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI • Paper • 2411.14522 • Published 7 days ago • 29
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? • Paper • 2411.16489 • Published 3 days ago • 28
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge • Paper • 2411.16594 • Published 3 days ago • 29
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games • Paper • 2411.13543 • Published 8 days ago • 17
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models • Paper • 2411.14982 • Published 6 days ago • 13
OminiControl: Minimal and Universal Control for Diffusion Transformer • Paper • 2411.15098 • Published 6 days ago • 38
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training • Paper • 2411.15124 • Published 6 days ago • 52
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models • Paper • 2411.14432 • Published 7 days ago • 19
Hymba: A Hybrid-head Architecture for Small Language Models • Paper • 2411.13676 • Published 8 days ago • 37
Multimodal Autoregressive Pre-training of Large Vision Encoders • Paper • 2411.14402 • Published 7 days ago • 37
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization • Paper • 2411.10442 • Published 13 days ago • 61
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions • Paper • 2411.14405 • Published 7 days ago • 51