Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance (NeurIPS 2024)

Kuan Heng Lin¹*, Sicheng Mo¹*, Ben Klingher¹, Fangzhou Mu², Bolei Zhou¹
¹UCLA ²NVIDIA
*Equal contribution

Ctrl-X teaser figure

Getting started

Environment setup

Our code is built on top of diffusers v0.28.0. To set up the environment, please run the following.

conda env create -f environment.yaml
conda activate ctrlx
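
After activating the environment, it can be useful to confirm that the installed diffusers version matches the one named above. The following is a minimal sketch, not part of the repository: the helper name `diffusers_status` is hypothetical, and it assumes only that the package is installed under the distribution name `diffusers`.

```python
from importlib import metadata

# Version stated in this README; adjust if environment.yaml pins something else.
EXPECTED = "0.28.0"


def diffusers_status(expected: str = EXPECTED) -> str:
    """Return a short message describing the installed diffusers version.

    Hypothetical helper for illustration; it is not shipped with Ctrl-X.
    """
    try:
        installed = metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        return "diffusers is not installed; is the ctrlx environment active?"
    if installed == expected:
        return f"diffusers {installed} matches the expected version"
    return f"diffusers {installed} found, but this README expects {expected}"


if __name__ == "__main__":
    print(diffusers_status())
```

Run it inside the activated `ctrlx` environment. A mismatch does not necessarily mean the code will fail, only that the setup differs from the one described here.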

Gradio demo

We provide a user interface for testing our method. Running the following command starts the demo.

python3 app_ctrlx.py

Have fun playing around! :D

Contact

For questions, thoughts, or discussion, please contact Kuan Heng (Jordan) Lin ([email protected]).

Reference

If you use our code in your research, please cite the following work.

@inproceedings{lin2024ctrlx,
    author = {Lin, {Kuan Heng} and Mo, Sicheng and Klingher, Ben and Mu, Fangzhou and Zhou, Bolei},
    booktitle = {Advances in Neural Information Processing Systems},
    title = {Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance},
    year = {2024}
}