Collections

Collections including paper arxiv:2407.14358

Papers I want to read

Papers in my to-read list

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13 • 67
Chameleon: Mixed-Modal Early-Fusion Foundation Models

Paper • 2405.09818 • Published May 16 • 126
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24 • 53
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27 • 85

For Content Creator

Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era

Paper • 2305.06131 • Published May 10, 2023 • 2
Perpetual Humanoid Control for Real-time Simulated Avatars

Paper • 2305.06456 • Published May 10, 2023 • 1
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

Paper • 2305.10973 • Published May 18, 2023 • 32
LDM3D: Latent Diffusion Model for 3D

Paper • 2305.10853 • Published May 18, 2023 • 10

generative audio

Taming Data and Transformers for Audio Generation

Paper • 2406.19388 • Published Jun 27
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

Paper • 2406.11768 • Published Jun 17 • 20
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation

Paper • 2407.02869 • Published Jul 3 • 18
Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 23

Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 23

audio-model-use

stabilityai/stable-audio-open-1.0

Text-to-Audio • Updated Jul 31 • 28.9k • 949
Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 23
facebook/musicgen-small

Text-to-Audio • Updated Nov 17, 2023 • 53.5k • 337

stabilityai/stable-audio-open-1.0

Text-to-Audio • Updated Jul 31 • 28.9k • 949
Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 23

Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 23
Qwen2-Audio Technical Report

Paper • 2407.10759 • Published Jul 15 • 55
kyutai/moshiko-pytorch-bf16

Updated Sep 18 • 41.2k • 148
Presto! Distilling Steps and Layers for Accelerating Music Generation

Paper • 2410.05167 • Published Oct 7 • 15

SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models

Paper • 2407.15841 • Published Jul 22 • 39
Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 23
PlacidDreamer: Advancing Harmony in Text-to-3D Generation

Paper • 2407.13976 • Published Jul 19 • 5
Efficient Audio Captioning with Encoder-Level Knowledge Distillation

Paper • 2407.14329 • Published Jul 19 • 4

Audio generation

Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 23

Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 23
