How to re-use latent state as a new seed?
There are scripts for interpolating between different seeds to explore the latent space. Using a slerp function between two seeds is interesting, but I wonder whether one can re-seed with the old latent state.
I tried to return `cond_latents` (without the `1/0.18215` factor) from the `diffuse` method and then slerp between a normal distribution (like `init2`) and the returned `cond_latents`, so as to only add a bit of noise to the current latent state, in the hope of finding solutions near the previous state. But when I try to slerp or linearly interpolate between a new random seed and the returned latents, I don't get usable results.
I especially have problems figuring out how to normalize the tensor: using it naively leads to diverging values and images that get brighter and brighter, while subtracting the mean gives images that are too dim.
Let's say you want to interpolate between two prompts, with each resulting image "rooted" in the output of the first prompt.
This is what I've found to work for me:
- Your noise at each interpolation step will be the output of `slerp`.
- That noise is added at timestep 0 to the encoded "root" image (using the scheduler's `add_noise` function).
- Start diffusing.
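Roughly, in code, those steps would look like the sketch below. This is only a sketch: it assumes a diffusers-style scheduler with `set_timesteps`/`add_noise`, a `root_latents` tensor that is the VAE-encoded root image (already scaled by 0.18215), and the usual `slerp` helper from the interpolation scripts; all variable names are made up.

```python
import torch

def slerp(t, v0, v1, dot_threshold=0.9995):
    """Spherical interpolation between two noise tensors of the same shape."""
    dot = torch.sum(v0 * v1) / (torch.norm(v0) * torch.norm(v1))
    if torch.abs(dot) > dot_threshold:
        # Nearly colinear: fall back to plain linear interpolation.
        return (1.0 - t) * v0 + t * v1
    theta_0 = torch.acos(dot)
    sin_theta_0 = torch.sin(theta_0)
    theta_t = theta_0 * t
    s0 = torch.sin(theta_0 - theta_t) / sin_theta_0
    s1 = torch.sin(theta_t) / sin_theta_0
    return s0 * v0 + s1 * v1

# Illustrative usage (names are made up):
#   noise_t = slerp(alpha, noise_a, noise_b)          # interpolated seed noise
#   scheduler.set_timesteps(num_inference_steps)
#   noisy = scheduler.add_noise(root_latents, noise_t,
#                               scheduler.timesteps[:1])  # first timestep of the schedule
#   ... then run the usual denoising loop starting from `noisy`.
```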
That's kind of what the script does. It just interpolates two seeds and then diffuses the interpolated noise vector from step 0 to N.
Suppose I interpolate the noise with a small `alpha`, `init2 = init1 * (1.0 - alpha) + new_noise * alpha`, for a second image. Then it would probably be a waste of time to diffuse `init2` for N steps just to get an almost identical image to the one I already got with `init1`, when I could instead diffuse `result_latent_vector + alpha * new_noise` for only a few steps to find the second image.
Of course it's an open question how much noise you have to add to avoid getting the same result, but to explore how to work with the latent vector I would first need to understand how it can be reused. As the code uses `cond_latents` as the name for the seed noise, I suppose the seed is the initialization of the latent vector, and it should in principle be possible to use that vector as the seed for a new diffusion. But in practice it didn't work for me.
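For concreteness, the idea is something like this hypothetical snippet, where `result_latent_vector` would be the final latent from a previous run and `alpha` is small:

```python
import torch

# Hypothetical sketch of the reuse idea above; in practice this naive
# perturbation diverges for me instead of staying near the previous image.
new_noise = torch.randn_like(result_latent_vector)
perturbed = result_latent_vector + alpha * new_noise
# ... then diffuse `perturbed` for only a few steps instead of the full N.
```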
The image-to-image script `img2img.py` essentially does what you're describing; you could adapt it to work with the interpolation script. It takes an image, corrupts it with more or less noise depending on the strength parameter, and then "resumes" the diffusion process for the corresponding number of steps.
For example, with `num_inference_steps=50` and `strength=0.5`, img2img will:
- encode your input image (which could be the output of a previous interpolation step)
- corrupt it with the amount of noise you'd expect at step `strength * num_inference_steps = 25`
- resume diffusion for the remaining 25 steps
`strength=1` means that you're starting from pure noise as usual; `strength=0` will give you back the input image.
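Very roughly, the latent preparation inside img2img looks something like the sketch below. It is not the exact pipeline code: it assumes a diffusers-style VAE and a scheduler with `set_timesteps`/`add_noise`, and `prepare_img2img_latents` and the other names are just illustrative.

```python
import torch

def prepare_img2img_latents(vae, scheduler, image, strength, num_inference_steps,
                            generator=None):
    # `image` is a preprocessed tensor in [-1, 1] of shape (1, 3, H, W), not a PIL image.
    scheduler.set_timesteps(num_inference_steps)

    # Encode the input image to latents and apply the 0.18215 scaling factor.
    latents = vae.encode(image).latent_dist.sample(generator) * 0.18215

    # Skip the first (1 - strength) fraction of the schedule: with strength=0.5
    # and 50 steps, the denoising loop will start at index 25.
    init_timestep = int(num_inference_steps * strength)
    t_start = num_inference_steps - init_timestep
    start_timesteps = scheduler.timesteps[t_start : t_start + 1]

    # Corrupt the latents with the amount of noise expected at that timestep.
    noise = torch.randn(latents.shape, generator=generator, device=latents.device)
    latents = scheduler.add_noise(latents, noise, start_timesteps)

    # Resume the denoising loop from index t_start (the remaining steps).
    return latents, t_start
```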
I had a look at the diffusers `StableDiffusionImg2ImgPipeline` yesterday but didn't get it to work yet. For the img2img script in the `stable-diffusion` repo I would first need to get the original code running, as I am currently using the `diffusers` implementation for everything. But maybe I can reuse some parts of that script in the diffusers-based script.
Check out my fork of diffusers - I got image-to-image to work: https://github.com/atarashansky/diffusers
Example usage (be warned, I am not using the safety checker as I found it a little too restrictive):
```python
from diffusers import StableDiffusionPipeline

model_id = "CompVis/stable-diffusion-v1-4"
pipe = StableDiffusionPipeline.from_pretrained(model_id, ...)
pipe = pipe.to("cuda")

prompt = "a fantasy landscape, trending on artstation"
init_image = ...  # a PIL.Image object

grid, images, seeds = pipe.make_grid(
    prompt,
    seed=1234,
    height=512,
    width=512,
    num_rows=3,
    num_columns=3,
    num_inference_steps=59,
    guidance_scale=7.5,
    init_image=init_image,  # set this to None to do normal text2image
    strength=0.4,  # this is the strength parameter
)
```
One more disclaimer: this currently only works with the default scheduler.
Thank you, that was exactly why I couldn't get the `image_to_image` example class to work in my own script.
This colab might also be useful! It shows how to reuse images of the same seed:
https://colab.research.google.com/github/pcuenca/diffusers-examples/blob/main/notebooks/stable-diffusion-seeds.ipynb