Wonder3D

Single Image to 3D using Cross-Domain Diffusion

Paper | Project page

Wonder3D reconstructs highly detailed textured meshes from a single-view image in only 2 to 3 minutes. Wonder3D first generates consistent multi-view normal maps with corresponding color images via a cross-domain diffusion model, and then leverages a novel normal fusion method to achieve fast and high-quality reconstruction.

Schedule

Inference code and pretrained models.
Huggingface demo.
Training code.
Rendering code for data preparation.

Preparation for inference

Install packages in requirements.txt.

conda create -n wonder3d
conda activate wonder3d
pip install -r requirements.txt

Download the checkpoints into the root folder.

Inference

Make sure you have the following models.

Wonder3D
|-- ckpts
    |-- unet
    |-- scheduler.bin
    ...
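Before running inference, it can help to confirm that the checkpoint files are where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_checkpoints` is hypothetical, and it checks only the two entries named in the tree above, since the `...` means other files may also be needed.

```python
from pathlib import Path

# Entries taken from the tree above. The "..." in that listing means more
# files may be required; this sketch deliberately does not guess them.
EXPECTED = ("ckpts/unet", "ckpts/scheduler.bin")


def missing_checkpoints(root):
    """Return the expected checkpoint paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Called from the repository root as `missing_checkpoints(".")`, it returns an empty list when both entries are present.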

Predict the foreground mask and use it as the alpha channel of the input image. We use Clipdrop to segment the foreground object interactively. You may also use rembg to remove the background.

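# !pip install rembg
import rembg
from PIL import Image

# "input.png" is a placeholder; replace it with the path to your own image.
result = Image.open("input.png")
result = rembg.remove(result)  # returns the image with the background removed
result.show()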
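Since the foreground mask is carried in the alpha channel, a quick sanity check on a prepared PNG is to read the color type from its header. This is a minimal sketch using only the standard library. The function name `png_has_alpha_channel` is hypothetical, and the check covers only the color type declared in the IHDR chunk; it does not detect palette images that carry transparency in a separate chunk.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_has_alpha_channel(data):
    """Return True if the PNG header declares an alpha channel.

    `data` is the raw bytes of the file (at least the first 26 bytes).
    """
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then width (4), height (4), bit depth (1), color type (1).
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    color_type = data[25]
    # Color type 4 is grayscale with alpha, 6 is RGB with alpha.
    return color_type in (4, 6)
```

For example, `png_has_alpha_channel(open("input.png", "rb").read())`, where `input.png` stands in for your own file.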

Run Wonder3D to produce multi-view-consistent normal maps and color images. You can then check the results in the folder ./outputs. (We use rembg to remove the backgrounds of the results, but the segmentations are not always perfect.)

accelerate launch --config_file 1gpu.yaml test_mvdiffusion_seq.py \
            --config mvdiffusion-joint-ortho-6views.yaml

Alternatively, run the provided script:

bash run_test.sh
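If you prefer to drive inference from Python, for instance inside a larger pipeline, the launch command above can be assembled and run with `subprocess`. This is a sketch rather than part of the repository: the helper names `build_inference_command` and `run_inference` are hypothetical, while the script and config file names are the ones shown in the command above.

```python
import subprocess


def build_inference_command(accelerate_config="1gpu.yaml",
                            model_config="mvdiffusion-joint-ortho-6views.yaml"):
    """Return the accelerate launch command as an argument list."""
    return [
        "accelerate", "launch",
        "--config_file", accelerate_config,
        "test_mvdiffusion_seq.py",
        "--config", model_config,
    ]


def run_inference(**kwargs):
    """Run the command from the repository root; raises if it exits non-zero."""
    subprocess.run(build_inference_command(**kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems when a config path contains spaces.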

Mesh Extraction

cd ./instant-nsr-pl
bash run.sh output_folder_path scene_name

Citation

If you find this repository useful in your project, please cite the following work. :)

@misc{long2023wonder3d,
      title={Wonder3D: Single Image to 3D using Cross-Domain Diffusion}, 
      author={Xiaoxiao Long and Yuan-Chen Guo and Cheng Lin and Yuan Liu and Zhiyang Dou and Lingjie Liu and Yuexin Ma and Song-Hai Zhang and Marc Habermann and Christian Theobalt and Wenping Wang},
      year={2023},
      eprint={2310.15008},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}