[bug] The custom pipeline lpw_stable_diffusion_xl is not currently taking effect.

#33
by skytnt - opened

The StableDiffusionXLPipeline.from_single_file method does not accept the custom_pipeline parameter; it is silently ignored.

https://huggingface.co/docs/diffusers/v0.30.0/en/api/loaders/single_file#diffusers.loaders.FromSingleFileMixin.from_single_file

If you run the following code

import torch
from diffusers import StableDiffusionXLPipeline
from diffusers.models import AutoencoderKL

def load_pipeline(model_name):
    vae = AutoencoderKL.from_pretrained(
        "madebyollin/sdxl-vae-fp16-fix",
        torch_dtype=torch.float16,
    )
    pipeline = (
        StableDiffusionXLPipeline.from_single_file
        if model_name.endswith(".safetensors")
        else StableDiffusionXLPipeline.from_pretrained
    )

    pipe = pipeline(
        model_name,
        vae=vae,
        torch_dtype=torch.float16,
        custom_pipeline="lpw_stable_diffusion_xl",  # silently ignored by from_single_file
        use_safetensors=True,
        add_watermarker=False,
    )

    pipe.to("cuda")
    return pipe

pipe = load_pipeline("https://huggingface.co/cagliostrolab/animagine-xl-3.1/blob/main/animagine-xl-3.1.safetensors")
prompt = "1girl, animal ear fluff, fox girl, kitsune, fox ears, fox tail, hair ornament,tail ornament, white hair, long hair, ahoge, blush, ribbon, hair ribbon, neck ribbon, bow, heterochromia, smiling, open mouth, bare shoulders, detached sleeves,sideboob, furisode, hakama,torii, huge breast, umbrella,"
negative_prompt = "nsfw, lowres, (bad), text, error, fewer, extra, missing, worst quality, jpeg artifacts, low quality, watermark, unfinished, displeasing, oldest, early, chromatic aberration, signature, extra digits, artistic error, username, scan"

image = pipe(
    prompt, 
    negative_prompt=negative_prompt,
    width=832,
    height=1216, 
    guidance_scale=7,
    num_inference_steps=28
).images[0]

you will see
Token indices sequence length is longer than the specified maximum sequence length for this model (82 > 77). Running this sequence through the model will result in indexing errors
The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['huge breast, umbrella,']

This truncation warning shows the loaded pipe is the stock StableDiffusionXLPipeline, not lpw_stable_diffusion_xl (the LPW pipeline exists precisely to lift the 77-token CLIP limit).

You can work around it with pipe = load_pipeline("cagliostrolab/animagine-xl-3.1"), which goes through from_pretrained instead.
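
For reference, a minimal sketch of the workaround. DiffusionPipeline.from_pretrained does honor custom_pipeline; the printed class name is what I'd expect from the community LPW pipeline, so treat it as an assumption rather than verified output.

import torch
from diffusers import DiffusionPipeline

# custom_pipeline is honored by from_pretrained, unlike from_single_file
pipe = DiffusionPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-3.1",
    custom_pipeline="lpw_stable_diffusion_xl",
    torch_dtype=torch.float16,
)
pipe.to("cuda")
# Should report the community LPW class (SDXLLongPromptWeightingPipeline),
# not the stock StableDiffusionXLPipeline.
print(type(pipe).__name__)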

skytnt changed discussion title from The custom pipeline lpw_stable_diffusion_xl is not currently taking effect. Can you fix it? to [bug] The custom pipeline lpw_stable_diffusion_xl is not currently taking effect.
Cagliostro Research Lab org • edited Aug 25

Hi, you're right, lpw_stable_diffusion_xl is not taking effect. Actually, I noticed this four months ago and asked our Gradio dev to fix it. I think we either accidentally let this one slip or didn't fix it for some reason.

Let me notify our Gradio dev again.

Excuse me for the sidebar.
from_single_file and from_pretrained are not Gradio features; they are Diffusers (or Transformers) features.
It would be better to let the Diffusers developers know.

Cagliostro Research Lab org

I already let the Diffusers dev know since April (https://github.com/huggingface/diffusers/issues/7666).

What I mean by our Gradio dev is the guy building the demo app.

So that's what you meant...
Sorry for the misunderstanding. 😓

I had a quick look at the code, and it seems the author wrote it so that from_pretrained can also be used.
If you don't care about from_single_file, just set the Space environment variable MODEL to cagliostrolab/animagine-xl-3.1 and it should work fine, as skytnt also said.
Currently it is probably set to the ~.safetensors URL.
I've also heard reports of problems with community pipelines in general (the lpw_~ ones), so it's possible that won't fix it, but that's beside the point.

We shouldn't take the job away from him, though, if there is a dedicated maintainer to begin with.

To skytnt:

Thanks for the concrete code example. The actual code looks like the following, so the quickest fix is the change below.
The original author seems to have assumed the model would be set through an environment variable...
Why not just make it work?

MODEL = os.getenv(
    "MODEL",
    "https://huggingface.co/cagliostrolab/animagine-xl-3.1/blob/main/animagine-xl-3.1.safetensors",
)

↓

MODEL = "cagliostrolab/animagine-xl-3.1"
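
If you still want to keep the environment-variable override, a middle-ground sketch would change only the default:

import os

# Keep the MODEL override, but default to the Diffusers-format repo so that
# from_pretrained (and therefore custom_pipeline) is used out of the box.
MODEL = os.getenv("MODEL", "cagliostrolab/animagine-xl-3.1")
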
Cagliostro Research Lab org

Hi, I decided to fix this myself. It should be working now. Let me know if something doesn't seem right.

Hello. I think it is working fine. If I may point it out, this part works the same as before the fix. (The author branched so that from_pretrained is used when the MODEL string does not end with ".safetensors".)

    pipeline = (
        StableDiffusionXLPipeline.from_single_file
        if MODEL.endswith(".safetensors")
        else StableDiffusionXLPipeline.from_pretrained
    )
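
Incidentally, if the Space must keep single-file loading, newer Diffusers releases (0.27+) have DiffusionPipeline.from_pipe, which the docs show being used to rewrap an existing pipeline's components into a community pipeline. A hedged sketch, assuming from_pipe accepts custom_pipeline here the same way:

import torch
from diffusers import DiffusionPipeline, StableDiffusionXLPipeline

# Load the single-file checkpoint as usual...
base = StableDiffusionXLPipeline.from_single_file(
    MODEL, torch_dtype=torch.float16, use_safetensors=True
)
# ...then rewrap its components into the community LPW pipeline.
pipe = DiffusionPipeline.from_pipe(base, custom_pipeline="lpw_stable_diffusion_xl")
pipe.to("cuda")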

If it were an individual repo, I could open a PR myself, but with an organization it is not so easy for an outsider to do so...

Cagliostro Research Lab org

The code should be back now.
But I think HF is having a resource issue that prevents the space from starting properly.

I'll keep restarting the space until the GPU quotas are back to normal.

https://huggingface.co/posts/bartowski/524900219749834#66cfaa865594967fbdedc542
As I mentioned in another post, there's been a major change to HF's Zero GPU Spaces, but no real announcement. 🙀
So I put a PR out there.

For some reason, several Japanese users are going around fixing repos all over the place with PRs.
Perhaps some info is circulating on an outside forum, social media, or a news site?
I don't know, since I haven't checked recently either, but I can think of a few likely places.

Cagliostro Research Lab org • edited Aug 29

In our case, the reason the space starts very slowly is that it has to generate 6 samples every time it boots (don't ask me why that decision was made in the first place).
So when the ZeroGPU resources on HF are capped this badly, it crashes on boot.

Now it's finally running well.

EDIT: I spoke too soon; let me fix it according to the post.

I've also sent a PR with code that should skip the sample generation. If that was an organizational decision rather than an HF spec change, well, I can't help but wonder...

Cagliostro Research Lab org

I've forwarded the runtime error issue to our Gradio dev, so I'll let the expert handle it.

That would be good.
We can't rule out that this is the only problem, nor that the recent spate of problems is over. Maintainers should stay on alert.

Thanks for your hard work; the space seems to be working fine now.

Why remove add_watermarker=False in the commit?
image.png

Having said that, you didn't have to delete use_safetensors either.
Diffusers-format checkpoints are (now) safetensors too, although their contents are laid out differently; it's like the difference between GIF and PNG among image files.

Cagliostro Research Lab org

Why remove add_watermarker=False in the commit?
image.png

My bad.

Cagliostro Research Lab org

Having said that, you didn't have to delete use_safetensors either.
Diffusers-format checkpoints are (now) safetensors too, although their contents are laid out differently; it's like the difference between GIF and PNG among image files.

I removed it because the commit in question didn't use single-file loading; well, it's back now, so maybe adding it back is a good idea.

Sorry for the mess. I've got a bad flu and couldn't think straight earlier.

Take care. 😰

Well, use_safetensors=True or not, it should have no effect as long as no .bin files sit in the same folder as the safetensors. (There aren't, are there?)
Strictly speaking, it is only a switch for specifying which format to load.
If you pass variant="fp16", the .fp16.safetensors files are loaded instead, which can be useful for multi-model Spaces and the like, but when the model stands alone, as here, it should have virtually no effect.

If you need to save VRAM, you should specify torch_dtype=torch.float16 when loading.
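
For example, the two options combine like this (a sketch; stabilityai/stable-diffusion-xl-base-1.0 is just an example of a repo that actually ships fp16 variant files):

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    variant="fp16",             # fetch the *.fp16.safetensors weight files
    torch_dtype=torch.float16,  # keep weights in half precision to save VRAM
    use_safetensors=True,       # only matters if .bin files coexist in the repo
)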

In any case, it is better to leave it to the maintainer.
After all, familiarity with the library is what determines whether you can get it right (in programming these days, at least; it wasn't always so).

kayfahaarukku changed discussion status to closed
