Content-Style Composition (GoGoGo)
Diffusion-Based Image Restoration Model
Generate Talking Avatars from Text-to-Speech
Audio-Driven Portrait Animations
Co-Speech Gesture 3D Motion Generation
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Import a portrait, click to move the head!