High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching Paper • 2407.03648 • Published Jul 4 • 16
MelodyFlow Collection • MelodyFlow: High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching • 7 items • Updated Oct 23 • 16
LayerSkip Collection • Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710 • 8 items • Updated 14 days ago • 45
Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise Paper • 2410.03017 • Published Oct 3 • 25
Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources Paper • 2409.08239 • Published Sep 12 • 16
Platypus: A Generalized Specialist Model for Reading Text in Various Forms Paper • 2408.14805 • Published Aug 27 • 13
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine Paper • 2408.02900 • Published Aug 6 • 25
Transformer Explainer: Interactive Learning of Text-Generative Models Paper • 2408.04619 • Published Aug 8 • 155
CoverBench: A Challenging Benchmark for Complex Claim Verification Paper • 2408.03325 • Published Aug 6 • 14
Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names Paper • 2408.00298 • Published Aug 1 • 9
Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language Paper • 2406.05629 • Published Jun 9 • 7
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation Paper • 2406.08392 • Published Jun 12 • 18
Make It Count: Text-to-Image Generation with an Accurate Number of Objects Paper • 2406.10210 • Published Jun 14 • 76
DialSim: A Real-Time Simulator for Evaluating Long-Term Dialogue Understanding of Conversational Agents Paper • 2406.13144 • Published Jun 19 • 11
MotionBooth: Motion-Aware Customized Text-to-Video Generation Paper • 2406.17758 • Published Jun 25 • 18
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation Paper • 2406.18522 • Published Jun 26 • 44