library_name: diffusers
base_model: stabilityai/stable-diffusion-xl-base-1.0
tags:
  - text-to-image
license: openrail++
inference: false

Latent Consistency Model (LCM): SSD-1B

Latent Consistency Models (LCMs) were proposed in Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference by Simian Luo, Yiqin Tan et al. Simian Luo, Suraj Patil, and Daniel Gu then successfully applied the same approach to create an LCM for SDXL.

This checkpoint is an LCM-distilled version of segmind/SSD-1B that reduces the number of inference steps needed to between 2 and 8.

Usage

LCM SDXL is supported in the 🤗 Hugging Face Diffusers library from version v0.23.0 onwards. To run the model, first install the latest version of the Diffusers library as well as peft, accelerate and transformers:

pip install --upgrade pip
pip install --upgrade diffusers transformers accelerate peft
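
After installing, it can be useful to confirm that the environment actually has Diffusers v0.23.0 or newer. The following is a minimal sketch using only the Python standard library; the helper names `version_tuple` and `diffusers_is_recent_enough` are hypothetical and not part of Diffusers, and the parsing is deliberately simple (it ignores pre-release suffixes such as `.dev0` or `rc1`).

```python
from importlib import metadata

# Minimum Diffusers version stated in this card.
MIN_DIFFUSERS = (0, 23, 0)


def version_tuple(version: str) -> tuple:
    """Turn '0.23.0' or '0.24.0.dev0' into a 3-tuple of leading integers."""
    parts = []
    for piece in version.lstrip("v").split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '0.23'
    return tuple(parts[:3])


def diffusers_is_recent_enough() -> bool:
    """Return True if an installed diffusers meets the stated minimum."""
    try:
        installed = metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        return False  # diffusers is not installed in this environment
    return version_tuple(installed) >= MIN_DIFFUSERS


if __name__ == "__main__":
    print("diffusers >= 0.23.0:", diffusers_is_recent_enough())
```

If the check prints `False`, re-running the `pip install --upgrade` command above in the same environment is the likely fix.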

Text-to-Image

The model can be loaded with its base pipeline, segmind/SSD-1B. Next, the scheduler needs to be changed to LCMScheduler, after which the number of inference steps can be reduced to just 2 to 8.

from diffusers import UNet2DConditionModel, DiffusionPipeline, LCMScheduler
import torch

# Load the LCM-distilled UNet and plug it into the SSD-1B base pipeline.
unet = UNet2DConditionModel.from_pretrained("latent-consistency/lcm-ssd-1b", torch_dtype=torch.float16, variant="fp16")
pipe = DiffusionPipeline.from_pretrained("segmind/SSD-1B", unet=unet, torch_dtype=torch.float16, variant="fp16")

# Swap the default scheduler for LCMScheduler, reusing the existing scheduler config.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")

prompt = "a close-up picture of an old man standing in the rain"

# Few-step sampling: 4 steps here, anywhere from 2 to 8 is the intended range.
image = pipe(prompt, num_inference_steps=4, guidance_scale=1.0).images[0]
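
Because this checkpoint is meant to run with 2 to 8 steps, a thin wrapper that rejects step counts outside that range can catch a leftover default such as 50 from a non-LCM script. The sketch below is an illustration, not part of Diffusers: `generate`, `MIN_STEPS`, and `MAX_STEPS` are hypothetical names, and `pipe` is assumed to be the pipeline configured above. The `generator` argument is passed through to the pipeline so that a seeded `torch.Generator` can make runs reproducible.

```python
# Step range stated in this card for the LCM-distilled checkpoint.
MIN_STEPS, MAX_STEPS = 2, 8


def generate(pipe, prompt, num_inference_steps=4, guidance_scale=1.0, generator=None):
    """Run an already-configured LCM pipeline after validating the step count."""
    if not MIN_STEPS <= num_inference_steps <= MAX_STEPS:
        raise ValueError(
            f"num_inference_steps={num_inference_steps} is outside the "
            f"{MIN_STEPS}-{MAX_STEPS} range this checkpoint is intended for"
        )
    result = pipe(
        prompt,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        generator=generator,
    )
    return result.images[0]


# Example usage with the pipeline from above (requires torch and a CUDA device):
#   generator = torch.Generator("cuda").manual_seed(0)
#   image = generate(pipe, prompt, num_inference_steps=4, generator=generator)
#   image.save("old_man.png")
```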

Image-to-Image

Works as well! TODO docs

Inpainting

Works as well! TODO docs

ControlNet

Works as well! TODO docs

T2I Adapter

Works as well! TODO docs

Speed Benchmark

TODO

Training

TODO