image - a zzfive Collection

Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

zzfive 's Collections

3d

image

LLMs

video

agent

cv

audio

robot

image

updated about 19 hours ago

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17 • 9
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18 • 16
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19 • 59
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24 • 73
UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion

Paper • 2401.13388 • Published Jan 24 • 11
DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing

Paper • 2402.02583 • Published Feb 4 • 7
SDXL-Lightning: Progressive Adversarial Diffusion Distillation

Paper • 2402.13929 • Published Feb 21 • 28
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

Paper • 2402.14167 • Published Feb 21 • 10
Subobject-level Image Tokenization

Paper • 2402.14327 • Published Feb 22 • 17
Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition

Paper • 2402.15504 • Published Feb 23 • 21
Multi-LoRA Composition for Image Generation

Paper • 2402.16843 • Published Feb 26 • 28
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Paper • 2402.17485 • Published Feb 27 • 189
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Paper • 2402.19481 • Published Feb 29 • 20
Trajectory Consistency Distillation

Paper • 2402.19159 • Published Feb 29 • 14
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization

Paper • 2403.00483 • Published Mar 1 • 12
ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models

Paper • 2403.02084 • Published Mar 4 • 14
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Paper • 2403.01779 • Published Mar 4 • 28
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5 • 59
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Paper • 2403.05135 • Published Mar 8 • 42
Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM

Paper • 2403.07487 • Published Mar 12 • 13
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering

Paper • 2403.09622 • Published Mar 14 • 16
StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control

Paper • 2403.09055 • Published Mar 14 • 24
IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models

Paper • 2403.13535 • Published Mar 20 • 22
DepthFM: Fast Monocular Depth Estimation with Flow Matching

Paper • 2403.13788 • Published Mar 20 • 17
Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos

Paper • 2403.13044 • Published Mar 19 • 15
FlashFace: Human Image Personalization with High-fidelity Identity Preservation

Paper • 2403.17008 • Published Mar 25 • 19
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions

Paper • 2403.16627 • Published Mar 25 • 20
ViTAR: Vision Transformer with Any Resolution

Paper • 2403.18361 • Published Mar 27 • 52
ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion

Paper • 2403.18818 • Published Mar 27 • 25
CosmicMan: A Text-to-Image Foundation Model for Humans

Paper • 2404.01294 • Published Apr 1 • 15
Condition-Aware Neural Network for Controlled Image Generation

Paper • 2404.01143 • Published Apr 1 • 11
Measuring Style Similarity in Diffusion Models

Paper • 2404.01292 • Published Apr 1 • 16
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Paper • 2404.03653 • Published Apr 4 • 33
RL for Consistency Models: Faster Reward Guided Text-to-Image Generation

Paper • 2404.03673 • Published Mar 25 • 14
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback

Paper • 2404.07987 • Published Apr 11 • 47
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies

Paper • 2404.08197 • Published Apr 12 • 27
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

Paper • 2404.09967 • Published Apr 15 • 20
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

Paper • 2404.09990 • Published Apr 15 • 12
Dynamic Typography: Bringing Words to Life

Paper • 2404.11614 • Published Apr 17 • 44
MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation

Paper • 2404.11565 • Published Apr 17 • 14
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis

Paper • 2404.13686 • Published Apr 21 • 27
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

Paper • 2404.14507 • Published Apr 22 • 21
PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Paper • 2404.16022 • Published Apr 24 • 20
Editable Image Elements for Controllable Synthesis

Paper • 2404.16029 • Published Apr 24 • 10
ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning

Paper • 2404.15449 • Published Apr 23 • 11
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving

Paper • 2404.16771 • Published Apr 25 • 16
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation

Paper • 2405.01434 • Published May 2 • 52
Customizing Text-to-Image Models with a Single Image Pair

Paper • 2405.01536 • Published May 2 • 18
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

Paper • 2405.12970 • Published May 21 • 22
RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance

Paper • 2405.14677 • Published May 23 • 9
DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis

Paper • 2405.14224 • Published May 23 • 12
Semantica: An Adaptable Image-Conditioned Diffusion Model

Paper • 2405.14857 • Published May 23 • 8
EM Distillation for One-step Diffusion Models

Paper • 2405.16852 • Published May 27 • 10
Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

Paper • 2405.16759 • Published May 27 • 7
Phased Consistency Model

Paper • 2405.18407 • Published May 28 • 46
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model

Paper • 2406.04333 • Published Jun 6 • 36
pOps: Photo-Inspired Diffusion Operators

Paper • 2406.01300 • Published Jun 3 • 16
Zero-shot Image Editing with Reference Imitation

Paper • 2406.07547 • Published Jun 11 • 31
An Image is Worth 32 Tokens for Reconstruction and Generation

Paper • 2406.07550 • Published Jun 11 • 55
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11 • 10
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation

Paper • 2406.08392 • Published Jun 12 • 18
Depth Anything V2

Paper • 2406.09414 • Published Jun 13 • 92
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13 • 50
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models

Paper • 2406.09416 • Published Jun 13 • 28
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Paper • 2406.09162 • Published Jun 13 • 13
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering

Paper • 2406.10208 • Published Jun 14 • 21
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models

Paper • 2406.11831 • Published Jun 17 • 20
The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing

Paper • 2406.10601 • Published Jun 15 • 65
Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps

Paper • 2406.14539 • Published Jun 20 • 26
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

Paper • 2406.16855 • Published Jun 24 • 54
Aligning Diffusion Models with Noise-Conditioned Perception

Paper • 2406.17636 • Published Jun 25 • 26
Magic Insert: Style-Aware Drag-and-Drop

Paper • 2407.02489 • Published Jul 2 • 20
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents

Paper • 2407.03300 • Published Jul 3 • 11
PartCraft: Crafting Creative Objects by Parts

Paper • 2407.04604 • Published Jul 5 • 4
SVGCraft: Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout

Paper • 2404.00412 • Published Mar 30 • 2
DataDream: Few-shot Guided Dataset Generation

Paper • 2407.10910 • Published Jul 15 • 8
Scaling Diffusion Transformers to 16 Billion Parameters

Paper • 2407.11633 • Published Jul 16 • 25
IMAGDressing-v1: Customizable Virtual Dressing

Paper • 2407.12705 • Published Jul 17 • 12
CGB-DM: Content and Graphic Balance Layout Generation with Transformer-based Diffusion Model

Paper • 2407.15233 • Published Jul 21 • 6
Artist: Aesthetically Controllable Text-Driven Stylization without Training

Paper • 2407.15842 • Published Jul 22 • 13
Discrete Flow Matching

Paper • 2407.15595 • Published Jul 22 • 11
ViPer: Visual Personalization of Generative Models via Individual Preference Learning

Paper • 2407.17365 • Published Jul 24 • 11
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

Paper • 2407.16982 • Published Jul 24 • 40
BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation

Paper • 2407.17952 • Published Jul 25 • 29
SHIC: Shape-Image Correspondences with no Keypoint Supervision

Paper • 2407.18907 • Published Jul 26 • 40
TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models

Paper • 2408.00735 • Published Aug 1 • 15
Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention

Paper • 2408.00760 • Published Aug 1 • 6
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

Paper • 2408.02657 • Published Aug 5 • 32
ProCreate, Dont Reproduce! Propulsive Energy Diffusion for Creative Generation

Paper • 2408.02226 • Published Aug 5 • 10
IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts

Paper • 2408.03209 • Published Aug 6 • 21
Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling

Paper • 2408.03695 • Published Aug 7 • 12
ControlNeXt: Powerful and Efficient Control for Image and Video Generation

Paper • 2408.06070 • Published Aug 12 • 52
BRAT: Bonus oRthogonAl Token for Architecture Agnostic Textual Inversion

Paper • 2408.04785 • Published Aug 8 • 6
UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization

Paper • 2408.05939 • Published Aug 12 • 13
Imagen 3

Paper • 2408.07009 • Published Aug 13 • 61
ZePo: Zero-Shot Portrait Stylization with Faster Sampling

Paper • 2408.05492 • Published Aug 10 • 7
Generative Photomontage

Paper • 2408.07116 • Published Aug 13 • 19
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

Paper • 2408.08459 • Published Aug 15 • 44
TurboEdit: Instant text-based image editing

Paper • 2408.08332 • Published Aug 14 • 19
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

Paper • 2408.09702 • Published Aug 19 • 9
TraDiffusion: Trajectory-Based Training-Free Image Generation

Paper • 2408.09739 • Published Aug 19 • 8
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning

Paper • 2408.11001 • Published Aug 20 • 11
The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks

Paper • 2408.10446 • Published Aug 19 • 6
Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22 • 25
CODE: Confident Ordinary Differential Editing

Paper • 2408.12418 • Published Aug 22 • 4
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published Aug 26 • 60
Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation

Paper • 2408.14819 • Published Aug 27 • 20
Distribution Backtracking Builds A Faster Convergence Trajectory for One-step Diffusion Distillation

Paper • 2408.15991 • Published Aug 28 • 15
CSGO: Content-Style Composition in Text-to-Image Generation

Paper • 2408.16766 • Published Aug 29 • 17
CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization

Paper • 2408.15914 • Published Aug 28 • 21
VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers

Paper • 2408.17131 • Published Aug 30 • 11
LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3 • 32
Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization

Paper • 2409.00492 • Published Aug 31 • 11
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing

Paper • 2409.01322 • Published Sep 2 • 94
IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation

Paper • 2409.08240 • Published Sep 12 • 18
InstantDrag: Improving Interactivity in Drag-based Image Editing

Paper • 2409.08857 • Published Sep 13 • 30
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation

Paper • 2409.12576 • Published Sep 19 • 15
Imagine yourself: Tuning-Free Personalized Image Generation

Paper • 2409.13346 • Published Sep 20 • 67
Colorful Diffuse Intrinsic Image Decomposition in the Wild

Paper • 2409.13690 • Published Sep 20 • 12
Improvements to SDXL in NovelAI Diffusion V3

Paper • 2409.15997 • Published Sep 24 • 11
Pixel-Space Post-Training of Latent Diffusion Models

Paper • 2409.17565 • Published Sep 26 • 19
OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction

Paper • 2410.04932 • Published Oct 7 • 9
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

Paper • 2410.01699 • Published Oct 2 • 18
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9 • 41
Story-Adapter: A Training-free Iterative Framework for Long Story Visualization

Paper • 2410.06244 • Published Oct 8 • 19
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models

Paper • 2410.02416 • Published Oct 3 • 25
DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models

Paper • 2410.08207 • Published Oct 10 • 18
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Paper • 2410.08261 • Published Oct 10 • 49
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

Paper • 2410.07133 • Published Oct 9 • 18
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations

Paper • 2410.10792 • Published Oct 14 • 26
Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices

Paper • 2410.11795 • Published Oct 15 • 16
Improving Long-Text Alignment for Text-to-Image Diffusion Models

Paper • 2410.11817 • Published Oct 15 • 14
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens

Paper • 2410.13863 • Published Oct 17 • 35
VidPanos: Generative Panoramic Videos from Casual Panning Videos

Paper • 2410.13832 • Published Oct 17 • 12
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model

Paper • 2410.13925 • Published Oct 17 • 22
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities

Paper • 2410.14672 • Published Oct 18 • 7
Scalable Ranked Preference Optimization for Text-to-Image Generation

Paper • 2410.18013 • Published 29 days ago • 14
Stable Consistency Tuning: Understanding and Improving Consistency Models

Paper • 2410.18958 • Published 28 days ago • 9
DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation

Paper • 2410.18666 • Published 28 days ago • 18
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published 24 days ago • 73
Constant Acceleration Flow

Paper • 2411.00322 • Published 21 days ago • 22
In-Context LoRA for Diffusion Transformers

Paper • 2410.23775 • Published 21 days ago • 10
Training-free Regional Prompting for Diffusion Transformers

Paper • 2411.02395 • Published 17 days ago • 23
Constrained Diffusion Implicit Models

Paper • 2411.00359 • Published 21 days ago • 5
SVDQunat: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Paper • 2411.05007 • Published 14 days ago • 16
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Paper • 2411.07232 • Published 10 days ago • 60
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision

Paper • 2411.07199 • Published 10 days ago • 42
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published 10 days ago • 28
Watermark Anything with Localized Messages

Paper • 2411.07231 • Published 10 days ago • 19
JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation

Paper • 2411.07975 • Published 9 days ago • 24
Scaling Properties of Diffusion Models for Perceptual Tasks

Paper • 2411.08034 • Published 9 days ago • 13
MagicQuill: An Intelligent Interactive Image Editing System

Paper • 2411.09703 • Published 7 days ago • 50
Inconsistencies In Consistency Models: Better ODE Solving Does Not Imply Better Samples

Paper • 2411.08954 • Published 8 days ago • 5
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

Paper • 2411.06558 • Published 11 days ago • 29
FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on

Paper • 2411.10499 • Published 6 days ago • 9
Continuous Speculative Decoding for Autoregressive Image Generation

Paper • 2411.11925 • Published 3 days ago • 13

Collection guide
Browse collections

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs