Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines • Paper • 2410.21220 • Published 4 days ago • 8
LongReward: Improving Long-context Large Language Models with AI Feedback • Paper • 2410.21252 • Published 4 days ago • 16
GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation • Paper • 2410.20474 • Published 5 days ago • 13