Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation
Abstract
Generating high-quality 3D objects from textual descriptions remains a challenging problem due to computational cost, the scarcity of 3D data, and the complexity of 3D representations. We introduce Geometry Image Diffusion (GIMDiffusion), a novel Text-to-3D model that uses geometry images to efficiently represent 3D shapes as 2D images, thereby avoiding the need for complex 3D-aware architectures. By integrating a Collaborative Control mechanism, we exploit the rich 2D priors of existing Text-to-Image models such as Stable Diffusion. This enables strong generalization even with limited 3D training data, allowing us to train exclusively on high-quality data, while retaining compatibility with guidance techniques such as IPAdapter. In short, GIMDiffusion generates 3D assets at speeds comparable to current Text-to-Image models. The generated objects consist of semantically meaningful, separate parts and include internal structures, enhancing both usability and versatility.
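To make the geometry-image representation concrete, below is a minimal sketch of how such an image could be turned back into a triangle mesh. The assumptions here are not taken from the paper: the geometry image is treated as an H x W x 3 array whose pixel values are XYZ surface coordinates, and a binary mask marks which pixels belong to the surface (e.g. separating the individual parts or charts). The actual layout and post-processing used by GIMDiffusion may differ.

```python
import numpy as np

def geometry_image_to_mesh(geom: np.ndarray, mask: np.ndarray):
    """Triangulate a geometry image by connecting neighbouring valid pixels.

    geom: (H, W, 3) float array of XYZ coordinates (hypothetical layout).
    mask: (H, W) bool array; True where the pixel lies on the surface.
    Returns (vertices, faces) as numpy arrays.
    """
    H, W = mask.shape
    # Assign a vertex index to every valid pixel (-1 marks invalid pixels).
    index = -np.ones((H, W), dtype=np.int64)
    index[mask] = np.arange(mask.sum())
    vertices = geom[mask]  # (V, 3)

    faces = []
    for y in range(H - 1):
        for x in range(W - 1):
            # Four corners of the current pixel quad.
            a, b = index[y, x], index[y, x + 1]
            c, d = index[y + 1, x], index[y + 1, x + 1]
            # Emit two triangles only if all four corners are valid,
            # so charts/parts separated by the mask stay disconnected.
            if min(a, b, c, d) >= 0:
                faces.append([a, b, c])
                faces.append([b, d, c])
    return vertices, np.asarray(faces, dtype=np.int64)
```

The resulting arrays could then be exported with any standard mesh library, for example `trimesh.Trimesh(vertices, faces).export("asset.obj")`.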
Community
We're proud to announce our new paper on Geometry Image Diffusion, which adapts existing image diffusion models to generate textured 3D models using Collaborative Control and geometry images.
Project page here: https://unity-research.github.io/Geometry-Image-Diffusion.github.io/
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation (2024)
- Text-to-Image Generation Via Energy-Based CLIP (2024)
- Scene123: One Prompt to 3D Scene Generation via Video-Assisted and Consistency-Enhanced MAE (2024)
- Large Point-to-Gaussian Model for Image-to-3D Generation (2024)
- Learning Part-aware 3D Representations by Fusing 2D Gaussians and Superquadrics (2024)
It is similar to https://huggingface.co/papers/2408.03178 - Object Images
Seems like a robust, general approach!