Different images while using same latents
Hi!
When generating two images from the same (duplicated) latents, I obtain two different images.
latents = torch.randn((1, 4, 128, 128), device='cuda').half().repeat(2, 1, 1, 1)
Digging into the code, it looks like the UNet predicts different noise even though it receives the same input twice.
Is this normal?
Where does this variation come from?
Thank you !!
Those are not two images but the text and non-text (unconditional) latents used for classifier-free guidance.
Thank you for your answer!! I am not sure I understand, though. Let me be more specific.
I am generating two images by setting
num_images_per_prompt = 2
in the StableDiffusionXLAdapterPipeline call.
I have a single prompt. I also pass the latents argument to the pipe, and it is the same for each image.
Therefore the inputs to the UNet are basically identical, yet the predicted noise differs between batch entries.
INPUT (latent_model_input):
torch.Size([4, 4, 128, 128])
tensor([[-0.2959, -0.2959, -0.2959, -0.2959, -0.2959],
[-0.2959, -0.2959, -0.2959, -0.2959, -0.2959],
[-0.2959, -0.2959, -0.2959, -0.2959, -0.2959],
[-0.2959, -0.2959, -0.2959, -0.2959, -0.2959]], device='cuda:0',
dtype=torch.float16)
OUTPUT (noise_pred):
torch.Size([4, 4, 128, 128])
tensor([[-0.2391, -0.1351, -0.1200, -0.1201, -0.1230],
[-0.2391, -0.1351, -0.1200, -0.1201, -0.1230],
[-0.2365, -0.1348, -0.1201, -0.1203, -0.1234],
[-0.2365, -0.1348, -0.1201, -0.1203, -0.1234]], device='cuda:0',
dtype=torch.float16)
Is there some source of randomness in the UNet pipeline?
Best,
Théo
Yeah, the first two entries in the batch are the non-text (unconditional) counterparts of the last two; they are only used for classifier-free guidance and are not returned as separate images.
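To make this concrete, here is a minimal sketch (not the actual pipeline code; the UNet call is replaced by a placeholder) of how diffusers-style pipelines typically batch latents for classifier-free guidance. With num_images_per_prompt = 2 and guidance enabled, the latents are duplicated once for the two images and again for the unconditional/conditional halves, giving the batch of 4 seen above:

```python
import torch

# Two images from the same starting latents, as in the question.
num_images_per_prompt = 2
latents = torch.randn((1, 4, 128, 128)).repeat(num_images_per_prompt, 1, 1, 1)

# With classifier-free guidance, the latents are duplicated: the first half
# is paired with the empty-prompt (unconditional) embeddings, the second
# half with the text embeddings. All 4 latent tensors are identical.
latent_model_input = torch.cat([latents] * 2)  # shape (4, 4, 128, 128)

# Placeholder for unet(latent_model_input, t, encoder_hidden_states=...).
# The real UNet sees identical latents but DIFFERENT text embeddings per
# half, which is why the two halves of noise_pred differ.
noise_pred = latent_model_input.clone()

# Guidance splits the prediction and recombines the halves; only the
# combined result drives the denoising step, so there are still only
# 2 images at the end.
noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
guidance_scale = 7.5
noise_pred = noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond)
```

So the variation is not randomness in the UNet: within each half the predictions are identical (rows 1–2 match and rows 3–4 match in the dump above); the halves differ only because of the different text embeddings.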