Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
Abstract
Conditional diffusion models have shown remarkable success in visual content generation, producing high-quality samples across various domains, largely due to classifier-free guidance (CFG). Recent attempts to extend guidance to unconditional models have relied on heuristic techniques, resulting in suboptimal generation quality and unintended effects. In this work, we propose Smoothed Energy Guidance (SEG), a novel training- and condition-free approach that leverages the energy-based perspective of the self-attention mechanism to enhance image generation. By defining the energy of self-attention, we introduce a method to reduce the curvature of the energy landscape of attention and use the output as the unconditional prediction. Practically, we control the curvature of the energy landscape by adjusting the Gaussian kernel parameter while keeping the guidance scale parameter fixed. Additionally, we present a query blurring method that is equivalent to blurring the entire attention weights without incurring quadratic complexity in the number of tokens. In our experiments, SEG achieves a Pareto improvement in both quality and the reduction of side effects. The code is available at https://github.com/SusungHong/SEG-SDXL.
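The query-blurring claim in the abstract can be illustrated with a small numerical check. The sketch below makes two assumptions: "attention weights" means the pre-softmax logits QK^T, and the blur is a linear filter applied along the query-token axis. For brevity it uses a 1D Gaussian over a flat token sequence, whereas image models would blur over the 2D spatial layout. The helper names (`gaussian_blur_matrix`, `matmul`, `transpose`) are hypothetical and are not taken from the SEG codebase.

```python
import math
import random


def gaussian_blur_matrix(n, sigma):
    """Row-normalized n x n Gaussian blur over a 1D token axis (illustrative)."""
    rows = []
    for i in range(n):
        weights = [math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2)) for j in range(n)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


random.seed(0)
n_tokens, dim = 6, 4
Q = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_tokens)]
K = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_tokens)]
B = gaussian_blur_matrix(n_tokens, sigma=1.5)

# Blur the queries first (n_tokens x dim entries), then form the logits.
logits_from_blurred_q = matmul(matmul(B, Q), transpose(K))

# Form the full logits first (n_tokens x n_tokens entries), then blur them.
blurred_logits = matmul(B, matmul(Q, transpose(K)))

# The two agree up to floating-point error because the blur is linear:
# (B Q) K^T = B (Q K^T).
max_diff = max(
    abs(x - y)
    for row_a, row_b in zip(logits_from_blurred_q, blurred_logits)
    for x, y in zip(row_a, row_b)
)
print(max_diff)
```

Under these assumptions, the saving comes from filtering a tokens-by-dimension query matrix instead of a tokens-by-tokens logit matrix. A larger `sigma` gives a stronger blur, which corresponds to the Gaussian kernel parameter the abstract describes as the practical control knob.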
Community
SEG is a training- and condition-free approach that leverages the energy-based perspective of the self-attention mechanism to enhance image generation. It outperforms prior works without significant side effects.
Paper: https://arxiv.org/abs/2408.00760
Code: https://github.com/SusungHong/SEG-SDXL
Similar papers recommended by the Semantic Scholar API (posted automatically by Librarian Bot):
- No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models (2024)
- CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models (2024)
- Plug-and-Play Diffusion Distillation (2024)
- Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models (2024)
- GeoGuide: Geometric guidance of diffusion models (2024)