Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation Paper • 2407.20445 • Published Jul 29 • 20 • 3
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27 • 189 • 20
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models Paper • 2311.10093 • Published Nov 16, 2023 • 57 • 7
Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts Paper • 2310.11784 • Published Oct 18, 2023 • 10 • 2
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model Paper • 2309.16058 • Published Sep 27, 2023 • 55 • 7