Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization • Paper • 2404.09956 • Published Apr 15 • 11
Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation • Paper • 2406.10970 • Published Jun 16 • 1
EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer • Paper • 2409.10819 • Published Sep 17 • 17
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation • Paper • 2409.09214 • Published Sep 13 • 47
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions • Paper • 2409.18042 • Published Sep 26 • 36
High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching • Paper • 2407.03648 • Published Jul 4 • 16