Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2312.13578

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16 • 17
CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention

Paper • 2409.01876 • Published Sep 3 • 1
DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 26
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians

Paper • 2312.03029 • Published Dec 5, 2023 • 23

Human Image/Video Generation

Collect latest human image/video generation papers

Sapiens: Foundation for Human Vision Models

Paper • 2408.12569 • Published Aug 22 • 88
ControlNeXt: Powerful and Efficient Control for Image and Video Generation

Paper • 2408.06070 • Published Aug 12 • 52
UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization

Paper • 2408.05939 • Published Aug 12 • 13
DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 26

3D Avatar Utils

Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance

Paper • 2401.15687 • Published Jan 28 • 21
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians

Paper • 2312.03029 • Published Dec 5, 2023 • 23
DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 26
Splatter Image: Ultra-Fast Single-View 3D Reconstruction

Paper • 2312.13150 • Published Dec 20, 2023 • 14

DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 26
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces

Paper • 2312.15715 • Published Dec 25, 2023 • 19
VidToMe: Video Token Merging for Zero-Shot Video Editing

Paper • 2312.10656 • Published Dec 17, 2023 • 10

Image to Image/ Video/ 3D

DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 26
ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors

Paper • 2312.13324 • Published Dec 20, 2023 • 9
PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns

Paper • 2312.04534 • Published Dec 7, 2023 • 6
Image Sculpting: Precise Object Editing with 3D Geometry Control

Paper • 2401.01702 • Published Jan 2 • 18

DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 26

talking-head-generation

DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 26
Splatter Image: Ultra-Fast Single-View 3D Reconstruction

Paper • 2312.13150 • Published Dec 20, 2023 • 14
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians

Paper • 2312.03029 • Published Dec 5, 2023 • 23
Relightable Gaussian Codec Avatars

Paper • 2312.03704 • Published Dec 6, 2023 • 29

One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning

Paper • 2306.07967 • Published Jun 13, 2023 • 24
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

Paper • 2306.07954 • Published Jun 13, 2023 • 113
TryOnDiffusion: A Tale of Two UNets

Paper • 2306.08276 • Published Jun 14, 2023 • 72
Seeing the World through Your Eyes

Paper • 2306.09348 • Published Jun 15, 2023 • 32

Table-GPT: Table-tuned GPT for Diverse Table Tasks

Paper • 2310.09263 • Published Oct 13, 2023 • 39
A Zero-Shot Language Agent for Computer Control with Structured Reflection

Paper • 2310.08740 • Published Oct 12, 2023 • 14
The Consensus Game: Language Model Generation via Equilibrium Search

Paper • 2310.09139 • Published Oct 13, 2023 • 12
PaLI-3 Vision Language Models: Smaller, Faster, Stronger

Paper • 2310.09199 • Published Oct 13, 2023 • 24

Diffusion Model

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

Paper • 2309.03895 • Published Sep 7, 2023 • 13
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning

Paper • 2309.16650 • Published Sep 28, 2023 • 10
CCEdit: Creative and Controllable Video Editing via Diffusion Models

Paper • 2309.16496 • Published Sep 28, 2023 • 9
FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling

Paper • 2310.15169 • Published Oct 23, 2023 • 9

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs