V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians Paper • 2409.13648 • Published Sep 20, 2024 • 9
LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation Paper • 2408.13252 • Published Aug 23, 2024 • 23
LLM-AD: Large Language Model based Audio Description System Paper • 2405.00983 • Published May 2, 2024 • 16
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation Paper • 2404.13026 • Published Apr 19, 2024 • 23
Binary Opacity Grids: Capturing Fine Geometric Detail for Mesh-Based View Synthesis Paper • 2402.12377 • Published Feb 19, 2024 • 8
Computing Power and the Governance of Artificial Intelligence Paper • 2402.08797 • Published Feb 13, 2024 • 11
StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis Paper • 2401.17093 • Published Jan 30, 2024 • 18
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 181
DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models Paper • 2312.14216 • Published Dec 21, 2023 • 10
ControlRoom3D: Room Generation using Semantic Proxy Rooms Paper • 2312.05208 • Published Dec 8, 2023 • 8
Eureka: Human-Level Reward Design via Coding Large Language Models Paper • 2310.12931 • Published Oct 19, 2023 • 26
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V Paper • 2310.11441 • Published Oct 17, 2023 • 26
Decaf: Monocular Deformation Capture for Face and Hand Interactions Paper • 2309.16670 • Published Sep 28, 2023 • 5
ControlMat: A Controlled Generative Approach to Material Capture Paper • 2309.01700 • Published Sep 4, 2023 • 13
VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation Paper • 2309.00398 • Published Sep 1, 2023 • 20