RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Paper • 2404.07839 • Published Apr 11 • 41
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4 • 60
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation Paper • 2404.05674 • Published Apr 8 • 13
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1 • 42
VIMI: Grounding Video Generation through Multi-modal Instruction Paper • 2407.06304 • Published Jul 8 • 9
MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions Paper • 2407.06358 • Published Jul 8 • 18
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models Paper • 2407.06938 • Published Jul 9 • 21
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence Paper • 2407.07061 • Published Jul 9 • 26
Medical SAM 2: Segment medical images as video via Segment Anything Model 2 Paper • 2408.00874 • Published Aug 1 • 41
VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads Paper • 2407.18245 • Published Jul 25 • 7
Learning Task Decomposition to Assist Humans in Competitive Programming Paper • 2406.04604 • Published Jun 7 • 4