Motion Prompting: Controlling Video Generation with Motion Trajectories Paper • 2412.02700 • Published 11 days ago • 12
Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-OASIS Paper • 2411.19655 • Published 15 days ago • 18
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation Paper • 2412.02592 • Published 11 days ago • 18
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning Paper • 2412.03248 • Published 10 days ago • 25
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published 9 days ago • 97
NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training Paper • 2412.02030 • Published 12 days ago • 17
MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation Paper • 2412.03558 • Published 10 days ago • 14
TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation Paper • 2412.03069 • Published 11 days ago • 29
Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion Paper • 2412.03515 • Published 10 days ago • 25
Imagine360: Immersive 360 Video Generation from Perspective Anchor Paper • 2412.03552 • Published 10 days ago • 26
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published 10 days ago • 111
LongIns: A Challenging Long-context Instruction-based Exam for LLMs Paper • 2406.17588 • Published Jun 25 • 22
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset Paper • 2402.05937 • Published Feb 8 • 12