Collections
Discover the best community collections!
Collections including paper arxiv:2408.03588
-
Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation
Paper • 2407.20445 • Published • 20 -
LP-MusicCaps: LLM-Based Pseudo Music Captioning
Paper • 2307.16372 • Published • 37 -
The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation
Paper • 2311.10057 • Published • 1 -
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models
Paper • 2408.01337 • Published • 10
-
SoundCTM: Uniting Score-based and Consistency Models for Text-to-Sound Generation
Paper • 2405.18503 • Published • 9 -
DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation
Paper • 2405.20289 • Published • 10 -
LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes
Paper • 2406.02897 • Published • 13 -
Audio Mamba: Bidirectional State Space Model for Audio Representation Learning
Paper • 2406.03344 • Published • 18
-
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper • 2310.00704 • Published • 19 -
Structural Similarities Between Language Models and Neural Response Measurements
Paper • 2306.01930 • Published • 2 -
Streaming Transformer ASR with Blockwise Synchronous Beam Search
Paper • 2006.14941 • Published • 2 -
NU-GAN: High resolution neural upsampling with GAN
Paper • 2010.11362 • Published • 2
-
A Novel 1D State Space for Efficient Music Rhythmic Analysis
Paper • 2111.00704 • Published -
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Paper • 2312.09911 • Published • 53 -
Music Style Transfer with Time-Varying Inversion of Diffusion Models
Paper • 2402.13763 • Published • 9 -
ChatMusician: Understanding and Generating Music Intrinsically with LLM
Paper • 2402.16153 • Published • 56