TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models
Abstract
We present TexFusion (Texture Diffusion), a new method to synthesize textures for given 3D geometries using large-scale text-guided image diffusion models. In contrast to recent works that leverage 2D text-to-image diffusion models to distill 3D objects via a slow and fragile optimization process, TexFusion introduces a new 3D-consistent generation technique, specifically designed for texture synthesis, that employs regular diffusion model sampling on different 2D rendered views. Specifically, we leverage latent diffusion models, apply the diffusion model's denoiser on a set of 2D renders of the 3D object, and aggregate the different denoising predictions on a shared latent texture map. Final output RGB textures are produced by optimizing an intermediate neural color field on the decodings of 2D renders of the latent texture. We thoroughly validate TexFusion and show that it efficiently generates diverse, high-quality, and globally coherent textures. We achieve state-of-the-art text-guided texture synthesis performance using only image diffusion models, while avoiding the pitfalls of previous distillation-based methods. The text conditioning offers detailed control, and we do not rely on any ground-truth 3D textures for training. This makes our method versatile and applicable to a broad range of geometry and texture types. We hope that TexFusion will advance AI-based texturing of 3D assets for applications in virtual reality, game design, simulation, and more.
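To make the multi-view aggregation described in the abstract concrete, the sketch below illustrates one denoising step in the spirit of the method: render the shared latent texture from several cameras, run the 2D latent diffusion denoiser on each rendered view, and back-project the per-view predictions into UV space with visibility-weighted averaging. This is a minimal illustration under our own assumptions; `render_latent`, `backproject`, `denoiser`, and the surrounding signatures are hypothetical stand-ins, not the authors' actual implementation.

```python
# Hypothetical sketch of TexFusion-style multi-view denoising aggregation.
# render_latent, backproject, and denoiser are assumed helpers, not the
# paper's real API.
import torch

def denoise_step(latent_texture, cameras, denoiser, t, prompt_emb,
                 render_latent, backproject):
    """One diffusion step: denoise 2D renders of the shared latent
    texture map and aggregate the per-view predictions back in UV space."""
    accum = torch.zeros_like(latent_texture)   # running sum of predictions
    weight = torch.zeros_like(latent_texture)  # per-texel visibility weights

    for cam in cameras:
        # Render the current noisy latent texture from this viewpoint,
        # keeping the screen-space -> UV correspondences.
        view_latent, uv_map = render_latent(latent_texture, cam)

        # Apply the 2D denoiser to the rendered view, conditioned on the
        # text prompt embedding at diffusion timestep t.
        pred = denoiser(view_latent, t, prompt_emb)

        # Splat the denoised prediction back onto the texture map via the
        # UV correspondences; the mask is 1 on texels visible in this view.
        tex_pred, tex_mask = backproject(pred, uv_map, latent_texture.shape)
        accum += tex_pred
        weight += tex_mask

    # Average predictions where views overlap; texels unseen by any camera
    # keep their previous latent value.
    visible = weight > 0
    return torch.where(visible, accum / weight.clamp(min=1), latent_texture)
```

Iterating this step over the diffusion schedule yields a 3D-consistent latent texture, from which the final RGB texture is then decoded and fit via the neural color field mentioned above.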
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- PaintHuman: Towards High-fidelity Text-to-3D Human Texturing via Denoised Score Distillation (2023)
- DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation (2023)
- Breathing New Life into 3D Assets with Generative Repainting (2023)
- HumanNorm: Learning Normal Diffusion Model for High-quality and Realistic 3D Human Generation (2023)
- HiFi-123: Towards High-fidelity One Image to 3D Content Generation (2023)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space.