Xuenan Xu

wsntxxn

https://wsntxxn.github.io

AI & ML interests

Text to Speech Synthesis, Text to Music Synthesis, Singing Voice Synthesis

Organizations

None yet

Papers 10

arxiv:2407.14329

arxiv:2407.02869

arxiv:2407.02857

arxiv:2406.08052

Spaces 2

MM StoryAgent

Efficient Audio Captioning

Models 7

wsntxxn/cnn14rnn-tempgru-audiocaps-captioning

Feature Extraction • Updated Aug 19 • 4 • 1

wsntxxn/effb2-trm-audiocaps-captioning

Feature Extraction • Updated Aug 19 • 56 • 1

wsntxxn/effb2-trm-clotho-captioning

Feature Extraction • Updated Aug 19 • 78 • 1

wsntxxn/cnn8rnn-w2vmean-audiocaps-grounding

Audio Classification • Updated Aug 19 • 185 • 2

wsntxxn/cnn8rnn-audioset-sed

Audio Classification • Updated Aug 13 • 547 • 2

wsntxxn/audiocaps-simple-tokenizer

wsntxxn/clotho-simple-tokenizer

Datasets

None public yet