EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
*Equal Contribution.
Terminal Technology Department, Alipay, Ant Group.
Model Files
./pretrained_models/
├── denoising_unet.pth
├── reference_unet.pth
├── motion_module.pth
├── face_locator.pth
├── sd-vae-ft-mse
│   └── ...
├── sd-image-variations-diffusers
│   └── ...
└── audio_processor
    └── whisper_tiny.pt
Some models in this hub can also be downloaded directly from their original hubs:
- sd-vae-ft-mse: Weights are intended to be used with the diffusers library. (Thanks to stabilityai)
- sd-image-variations-diffusers
- audio_processor
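Before running inference, it can help to confirm that the downloaded files match the layout listed under Model Files. The following is a minimal sketch, not part of the EchoMimic code base: the helper name `missing_model_files` is hypothetical, and it assumes that `whisper_tiny.pt` sits inside `audio_processor/` and that the two `sd-*` entries are directories.

```python
from pathlib import Path

# Expected entries, taken from the directory listing in this README.
# Assumption: whisper_tiny.pt is nested under audio_processor/.
EXPECTED_FILES = [
    "denoising_unet.pth",
    "reference_unet.pth",
    "motion_module.pth",
    "face_locator.pth",
    "audio_processor/whisper_tiny.pt",
]
# Assumption: these two entries are directories (their contents are elided as "...").
EXPECTED_DIRS = [
    "sd-vae-ft-mse",
    "sd-image-variations-diffusers",
]


def missing_model_files(root="./pretrained_models"):
    """Return the expected entries that are absent under `root`."""
    base = Path(root)
    missing = [f for f in EXPECTED_FILES if not (base / f).is_file()]
    missing += [d + "/" for d in EXPECTED_DIRS if not (base / d).is_dir()]
    return missing


if __name__ == "__main__":
    for name in missing_model_files():
        print("missing:", name)
```

An empty result means every listed entry is present. The check does not validate the contents of the files or of the two `sd-*` directories.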
Gallery
Audio Driven (Sing)
Audio Driven (English)
Audio Driven (Chinese)
Landmark Driven
Audio + Selected Landmark Driven
(Some demo images above are sourced from image websites. If there is any infringement, we will immediately remove them and apologize.)
Citation
If you find our work useful for your research, please consider citing the paper:
@misc{chen2024echomimic,
title={EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning},
author={Zhiyuan Chen and Jiajiong Cao and Zhiquan Chen and Yuming Li and Chenguang Ma},
year={2024},
archivePrefix={arXiv},
primaryClass={cs.CV}
}