

	---
	language:
	- en
	pipeline_tag: unconditional-image-generation
	tags:
	- Diffusion Models
	- Stable Diffusion
	- Perturbed-Attention Guidance
	- PAG
	---

	# Perturbed-Attention Guidance for SDXL

	<div style="display:flex">
	<video loop>
	<source src="pag_sdxl.mp4" type="video/mp4">
	</video>
	<video loop>
	<source src="pag_uncond.mp4" type="video/mp4">
	</video>
	</div>


	[Project](https://ku-cvlab.github.io/Perturbed-Attention-Guidance/) / [arXiv](https://arxiv.org/abs/2403.17377) / [GitHub](https://github.com/KU-CVLAB/Perturbed-Attention-Guidance)

	This repository is based on [Diffusers](https://huggingface.co/docs/diffusers/index). The pipeline is a modification of StableDiffusionXLPipeline to add Perturbed-Attention Guidance (PAG).

	The original Perturbed-Attention Guidance for unconditional models and SD1.5 by [Hyoungwon Cho](https://huggingface.co/hyoungwoncho) is available at [hyoungwoncho/sd_perturbed_attention_guidance](https://huggingface.co/hyoungwoncho/sd_perturbed_attention_guidance).

	## Quickstart

	Loading the custom pipeline:

	```py
	import torch
	from diffusers import StableDiffusionXLPipeline

	# Load the SDXL base weights with the community PAG pipeline, in half precision
	pipe = StableDiffusionXLPipeline.from_pretrained(
	    "stabilityai/stable-diffusion-xl-base-1.0",
	    custom_pipeline="multimodalart/sdxl_perturbed_attention_guidance",
	    torch_dtype=torch.float16,
	)

	device = "cuda"
	pipe = pipe.to(device)
	```

	Unconditional sampling with PAG:
	![image/jpeg](uncond_generation_pag.jpg)

	```py
	output = pipe(
	    "",
	    num_inference_steps=50,
	    guidance_scale=0.0,
	    pag_scale=5.0,
	    pag_applied_layers=['mid'],
	).images
	```

	Sampling with PAG and CFG:
	![image/jpeg](cfgpag.jpg)
	```py
	output = pipe(
	    "the spirit of a tamagotchi wandering in the city of Vienna",
	    num_inference_steps=25,
	    guidance_scale=4.0,
	    pag_scale=3.0,
	    pag_applied_layers=['mid'],
	).images
	```
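To see what PAG contributes, it can help to run the same prompt and seed at several `pag_scale` values. The helper below is a minimal sketch, not part of this repository: `sweep_pag_scale` and its `make_generator` argument are hypothetical names, and it assumes the custom pipeline accepts a `generator` argument the way the stock `StableDiffusionXLPipeline` does.

```python
def sweep_pag_scale(pipe, prompt, pag_scales, seed=0, make_generator=None, **pipe_kwargs):
    """Run one prompt at several PAG scales with a fixed seed.

    Returns a dict mapping each scale to the first generated image.
    Hypothetical helper; `generator` support is an assumption about the pipeline.
    """
    if make_generator is None:
        # Imported lazily so the helper itself has no hard dependency on torch.
        import torch

        def make_generator(s):
            return torch.Generator("cpu").manual_seed(s)

    results = {}
    for scale in pag_scales:
        # A fresh generator per run keeps the starting noise identical across scales.
        images = pipe(
            prompt,
            pag_scale=scale,
            generator=make_generator(seed),
            **pipe_kwargs,
        ).images
        results[scale] = images[0]
    return results
```

With the pipeline from the Quickstart, a call such as `sweep_pag_scale(pipe, "a prompt", [1.0, 3.0, 5.0], num_inference_steps=25, guidance_scale=4.0, pag_applied_layers=['mid'])` would return one image per scale, each of which can be written out with its `save` method.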

	## Parameters

	`guidance_scale`: guidance scale of CFG (e.g. `7.5`)

	`pag_scale`: guidance scale of PAG (e.g. `4.0`)

	`pag_applied_layers`: layers to apply the perturbation to (e.g. `['mid']`)

	`pag_applied_layers_index`: indices of the layers to apply the perturbation to (e.g. `['m0', 'm1']`)
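As a rough illustration of how `guidance_scale` and `pag_scale` interact, the sketch below combines noise predictions in the form PAG is commonly described: the CFG term moves the prediction away from the unconditional one, and the PAG term moves it away from the prediction made with perturbed self-attention. The function name and the plain-list representation are hypothetical. The actual pipeline operates on tensors inside the denoising loop and may differ in detail.

```python
def combine_guidance(eps_cond, eps_perturbed, pag_scale, eps_uncond=None, guidance_scale=1.0):
    """Element-wise combination of noise predictions (illustrative only).

    eps_cond:      prediction for the prompt (or the only prediction when CFG is off)
    eps_perturbed: prediction with perturbed self-attention
    eps_uncond:    unconditional prediction; None means CFG is not applied
    """
    if eps_uncond is None:
        # PAG only: start from the single available prediction.
        base = list(eps_cond)
    else:
        # Standard CFG: uncond + scale * (cond - uncond).
        base = [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]
    # PAG term: push away from the perturbed-attention prediction.
    return [b + pag_scale * (c - p) for b, c, p in zip(base, eps_cond, eps_perturbed)]
```

Under this formulation, setting `pag_scale=0.0` reduces to plain CFG, and omitting the unconditional prediction leaves PAG as the only guidance signal, which matches the unconditional sampling example above in spirit.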

	## Stable Diffusion XL Demo

	Coming soon.