IGOR: Image-GOal Representations are the Atomic Control Units for Foundation Models in Embodied AI Paper • 2411.00785 • Published Oct 17 • 8
Distributional Reinforcement Learning for Multi-Dimensional Reward Functions Paper • 2110.13578 • Published Oct 26, 2021 • 1
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation Paper • 2410.05363 • Published Oct 7 • 44
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting Paper • 2410.17856 • Published 29 days ago • 49
SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model Paper • 2403.13064 • Published Mar 19 • 31
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data Paper • 2401.10891 • Published Jan 19 • 59
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities Paper • 2401.12168 • Published Jan 22 • 26
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 258
M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts Paper • 2312.10763 • Published Dec 17, 2023 • 18
Holodeck: Language Guided Generation of 3D Embodied AI Environments Paper • 2312.09067 • Published Dec 14, 2023 • 13