# DragNUWA

**DragNUWA** enables users to manipulate backgrounds or objects within images directly, and the model seamlessly translates these actions into **camera movements** or **object motions**, generating the corresponding video.

See our paper: [DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory](https://arxiv.org/abs/2308.08089)

### DragNUWA 1.5 (Updated on Jan 8, 2024)

**DragNUWA 1.5** enables Stable Video Diffusion to animate an image along a user-specified path.
### DragNUWA 1.0 (Original Paper)

[**DragNUWA 1.0**](https://arxiv.org/abs/2308.08089) utilizes text, image, and trajectory as three essential control factors to facilitate highly controllable video generation from the semantic, spatial, and temporal aspects, respectively.
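To make the trajectory control factor concrete, here is a minimal illustrative sketch (our own, not code from this repository; the `resample_trajectory` helper and the 14-frame count are assumptions) of how a drag gesture can be viewed as an ordered list of image-space points resampled to one point per generated video frame:

```python
# Illustrative sketch only: NOT code from the DragNUWA repository.
# A user "drag" is treated as an ordered list of (x, y) image-space points,
# resampled to one point per generated video frame.
import numpy as np

def resample_trajectory(points, num_frames=14):
    """Linearly resample a hand-drawn drag path to `num_frames` points."""
    points = np.asarray(points, dtype=np.float32)  # (N, 2) clicked points
    t_in = np.linspace(0.0, 1.0, len(points))      # parameter along the drawn path
    t_out = np.linspace(0.0, 1.0, num_frames)      # one sample per output frame
    xs = np.interp(t_out, t_in, points[:, 0])
    ys = np.interp(t_out, t_in, points[:, 1])
    return np.stack([xs, ys], axis=1)              # (num_frames, 2)

# Example: drag an object from (100, 200) to (300, 220) with a slight arc.
path = resample_trajectory([(100, 200), (180, 260), (300, 220)])
print(path.shape)  # (14, 2)
```

In the actual pipeline, the drag points you draw in the Gradio demo are converted internally into the model's trajectory conditioning; the helper above only shows the kind of per-frame path such an input describes.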
## Getting Started

### Setting Up the Environment

```Shell
git clone -b svd https://github.com/ProjectNUWA/DragNUWA.git
cd DragNUWA

conda create -n DragNUWA python=3.8
conda activate DragNUWA
pip install -r environment.txt
```

### Download Pretrained Weights

Download the [pretrained weights](https://drive.google.com/file/d/1Z4JOley0SJCb35kFF4PCc6N6P1ftfX4i/view) to the `models/` directory, or directly run `bash models/Download.sh`.

### Drag and Animate!

```Shell
python DragNUWA_demo.py
```

This launches a Gradio demo in which you can drag an image and animate it.

### Acknowledgement

We appreciate the open-sourcing of the following projects:

- [Stable Video Diffusion](https://github.com/Stability-AI/generative-models)
- [Hugging Face](https://github.com/huggingface)
- [UniMatch](https://github.com/autonomousvision/unimatch)

### Citation

```bibtex
@article{yin2023dragnuwa,
  title={DragNUWA: Fine-grained control in video generation by integrating text, image, and trajectory},
  author={Yin, Shengming and Wu, Chenfei and Liang, Jian and Shi, Jie and Li, Houqiang and Ming, Gong and Duan, Nan},
  journal={arXiv preprint arXiv:2308.08089},
  year={2023}
}
```