Octo-planner: On-device Language Model for Planner-Action Agents • Paper • 2406.18082 • Published Jun 26 • 47
TroL: Traversal of Layers for Large Language and Vision Models • Paper • 2406.12246 • Published Jun 18 • 34
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models • Paper • 2406.11230 • Published Jun 17 • 34
Instruction Pre-Training: Language Models are Supervised Multitask Learners • Paper • 2406.14491 • Published Jun 20 • 85
Stylebreeder: Exploring and Democratizing Artistic Styles through Text-to-Image Models • Paper • 2406.14599 • Published Jun 20 • 16
LogoMotion: Visually Grounded Code Generation for Content-Aware Animation • Paper • 2405.07065 • Published May 11 • 16
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation • Paper • 2405.09546 • Published May 15 • 9