SirRa1zel

AI & ML interests

Text-to-Speech, Translation, Object Detection

Recent Activity

liked a Space 5 days ago

huggingface-projects/ai-video-composer

liked a model 5 days ago

OuteAI/OuteTTS-0.2-500M

liked a Space 12 days ago

opendatalab/MinerU

Organizations

None yet

SirRa1zel's activity

upvoted a paper 26 days ago

High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching

Paper • 2407.03648 • Published Jul 4 • 16

upvoted a collection 26 days ago

MelodyFlow

Collection

MelodyFlow: High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching • 7 items • Updated Oct 23 • 16

upvoted a collection about 1 month ago

LayerSkip

Collection

Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710 • 8 items • Updated 14 days ago • 45

upvoted a paper about 2 months ago

Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise

Paper • 2410.03017 • Published Oct 3 • 25

upvoted a paper 2 months ago

Prithvi WxC: Foundation Model for Weather and Climate

Paper • 2409.13598 • Published Sep 20 • 38

upvoted 2 papers 3 months ago

Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources

Paper • 2409.08239 • Published Sep 12 • 16

Platypus: A Generalized Specialist Model for Reading Text in Various Forms

Paper • 2408.14805 • Published Aug 27 • 13

upvoted 6 papers 4 months ago

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Paper • 2408.02900 • Published Aug 6 • 25

upvoted 7 papers 5 months ago

Tx-LLM: A Large Language Model for Therapeutics

Paper • 2406.06316 • Published Jun 10 • 15

Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language

Paper • 2406.05629 • Published Jun 9 • 7

FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation

Paper • 2406.08392 • Published Jun 12 • 18

Make It Count: Text-to-Image Generation with an Accurate Number of Objects

Paper • 2406.10210 • Published Jun 14 • 76

DialSim: A Real-Time Simulator for Evaluating Long-Term Dialogue Understanding of Conversational Agents

Paper • 2406.13144 • Published Jun 19 • 11

MotionBooth: Motion-Aware Customized Text-to-Video Generation

Paper • 2406.17758 • Published Jun 25 • 18

ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation

Paper • 2406.18522 • Published Jun 26 • 44