How to fix "RuntimeError: expected scalar type Half but found Float" when using fp16
Replace lines 272-273 in <pythondistr>\Lib\site-packages\torch\nn\modules\normalization.py

return F.group_norm(
    input, self.num_groups, self.weight, self.bias, self.eps)

with

return F.group_norm(
    input, self.num_groups, self.weight.type(input.dtype), self.bias.type(input.dtype), self.eps)
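For context, the patch addresses an fp16 input meeting GroupNorm affine parameters that are still float32. A minimal repro sketch, assuming a CUDA device and an older PyTorch build that does not cast mixed dtypes in group_norm itself:

import torch
import torch.nn as nn
import torch.nn.functional as F

# fp16 input flowing through a GroupNorm whose weight/bias are float32
gn = nn.GroupNorm(num_groups=2, num_channels=4).cuda()
x = torch.randn(1, 4, 8, 8, device="cuda", dtype=torch.float16)

# gn(x) raises "expected scalar type Half but found Float" on affected builds;
# casting the parameters to the input dtype, as in the patch above, avoids it:
out = F.group_norm(x, gn.num_groups, gn.weight.type(x.dtype), gn.bias.type(x.dtype), gn.eps)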
In <pythondistr>\Lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py, after this section (around lines 102-107)

latents = torch.randn(
    (batch_size, self.unet.in_channels, height // 8, width // 8),
    generator=generator,
    device=self.device,
)

add

latents = latents.half()
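If you would rather not edit the installed file for this step, newer pipeline versions accept a latents argument (it is used further down in this thread), so a sketch of the same idea from the caller's side is to create the latents in float16 yourself. pipe, prompt, height, and width here stand for the names in your own script; batch size 1 is assumed:

latents = torch.randn(
    (1, pipe.unet.in_channels, height // 8, width // 8),
    device=pipe.device,
    dtype=torch.float16,
)
image = pipe(prompt, latents=latents)["sample"][0]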
Finally, in <pythondistr>\Lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py, replace this (on lines 160-161)

safety_cheker_input = self.feature_extractor(self.numpy_to_pil(image), return_tensors="pt").to(self.device)
image, has_nsfw_concept = self.safety_checker(images=image, clip_input=safety_cheker_input.pixel_values)

with

safety_cheker_input = self.feature_extractor(self.numpy_to_pil(image), return_tensors="pt").to(self.device)
safety_cheker_input.pixel_values = safety_cheker_input.pixel_values.half()
image, has_nsfw_concept = self.safety_checker(images=image, clip_input=safety_cheker_input.pixel_values)
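If you would rather not edit site-packages at all (these edits are lost on every package upgrade), the first patch can also be applied at runtime. A hypothetical monkeypatch sketch, not an official API:

import torch
import torch.nn.functional as F

# Monkeypatch GroupNorm.forward so the affine parameters follow the input dtype.
def _group_norm_forward(self, input):
    weight = self.weight.type(input.dtype) if self.weight is not None else None
    bias = self.bias.type(input.dtype) if self.bias is not None else None
    return F.group_norm(input, self.num_groups, weight, bias, self.eps)

torch.nn.GroupNorm.forward = _group_norm_forward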
Hey @TessaCoil,
Thanks for the fix here! Does it happen when loading weights in torch.float16?
Could you maybe post a code snippet that currently leads to an error/bug? :-)
@patrickvonplaten As I understood it, it happens in at least one instance when you switch to CPU rather than the GPU-driven default selection - likely the self-hosted case - which I have seen bemoaned "in the wild".
Setting it to CPU then complains about no support for halfs, or vice versa. At first glance, this looks like a fix for that.
Here is a code snippet that causes the error.
import torch
from diffusers import StableDiffusionPipeline

TOKEN = 'hugging_face_token'
# get your token at https://huggingface.co/settings/tokens

def run():
    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4",
        revision="fp16",
        torch_dtype=torch.float16,
        use_auth_token=TOKEN,
    ).to("cuda")
    prompt = "a photo of an astronaut riding a horse on mars"
    image = pipe(prompt)["sample"][0]
    image.save("astronaut_rides_horse.png")

# Press the green button in the gutter to run the script.
if __name__ == '__main__':
    run()
@TessaCoil - I get the same error around line 82 (<pythondistr>\Lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py):

text_embeddings = self.text_encoder(text_input.input_ids.to(self.device))[0]

so none of the upcoming modifications are reached. What do you reckon I should change?
Edit: I forgot to wrap pipe(prompt)["sample"][0] in autocast("cuda").
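For reference, the fixed call looks roughly like this (a sketch against the diffusers version used in the snippet above, where the pipeline output still has a "sample" key):

from torch import autocast

# run the fp16 pipeline inside the autocast context so mixed dtypes are handled
with autocast("cuda"):
    image = pipe(prompt)["sample"][0]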
Hey, I'm facing a similar issue for the 'cpu' device in https://huggingface.co/spaces/nightfury/SD-InPainting/blob/main/app.py, as there is no GPU (CUDA) available.
If I set torch_dtype=torch.float16, then it throws
RuntimeError: expected scalar type Float but found BFloat16
If I set torch_dtype=torch.bfloat16, then it throws
RuntimeError: expected scalar type BFloat16 but found Float
If I set torch_dtype=torch.half, then it throws
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
If I set torch_dtype=torch.double, then it throws
RuntimeError: expected scalar type BFloat16 but found Double
If I set torch_dtype=torch.long, then it throws
raise TypeError('nn.Module.to only accepts floating point or complex '
TypeError: nn.Module.to only accepts floating point or complex dtypes, but got desired dtype=torch.int64
So I am really confused about which torch_dtype to use for a successful run.
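For what it's worth, the errors above all come from running reduced-precision dtypes on CPU, where those kernels are not (fully) implemented in these PyTorch builds. A sketch of the safest CPU configuration is to stay in the default float32; the model id and token are taken from the snippet earlier in this thread, not from your Space:

import torch
from diffusers import StableDiffusionPipeline

TOKEN = 'hugging_face_token'  # as in the snippet earlier in the thread

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float32,  # or simply omit torch_dtype on CPU
    use_auth_token=TOKEN,
).to("cpu")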
I came across the same error. I am also using diffusers 1.4. I added the with torch.autocast("cuda"): line above the pipe(prompt, latents=latents) call and the problem was solved.
Thanks, I also added with torch.autocast("cuda"): and it works for me.
Hi! I have a problem: Input type (float) and bias type (struct c10::Half) should be the same
The error is in \Lib\site-packages\torch\nn\modules\conv.py:

File "C:\Users\user\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
File "C:\Users\user\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same
END OF TRACEBACK

Do you know what I can do here?
Hi @MohammadMi ! As others mentioned, this usually happens when attempting to run the model in half precision on CPU, because CPU does not support half floats. Do you have a GPU in your computer, and are you trying to use it? Do you have a code snippet that demonstrates the problem?
Yes, I have a GPU - a GTX 1060Ti.
I didn't change any settings.
Do you need to see my webui-user? I set --no-half --lowvram --opt-split-attention.
Which code snippet do you mean?
I met the problem when using concurrent.futures for multi-threaded inference; I cannot solve the bug yet.
But when setting num_workers = 1, everything works fine.
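In case it helps with the threaded case: diffusers pipelines are not documented as thread-safe, so one hedged workaround sketch is to serialize calls with a lock (pipe and prompt are placeholder names, not from the post above):

import threading

pipe_lock = threading.Lock()

def generate(prompt):
    # only one thread runs the pipeline at a time
    with pipe_lock:
        return pipe(prompt).images[0]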
Hi, thanks for your wonderful model and weights. I came across a similar problem:
It always tells me that mat1 and mat2 don't match, with Half-vs-Float errors. I checked, and the UNet-related params are all float16 and the inputs are float16. When I change everything to float32 it works, but I lose the speed. Could you tell me the possible problem and solution?
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth", torch_dtype=weight_dtype)
pipeline = StableDiffusionControlNetPipeline.from_pretrained(
    self.config.diffusion_ckpt,
    vae=vae,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    unet=unet,
    controlnet=controlnet,
    safety_checker=None,
    # revision=args.revision,
    # variant=args.variant,
    torch_dtype=weight_dtype,
)
# pipeline.scheduler = UniPCMultistepScheduler.from_config(pipeline.scheduler.config)
pipeline = pipeline.to(accelerator.device)

# disp = disp.to(torch.float32)
# self.tmp_pipe.to(torch.float32)
latent = latent.to(torch.float32)
disp = disp.to(torch.float32)
pipeline.to(torch.float32)
result = pipeline(
    prompt=[self.positive_prompt],
    negative_prompt=[self.negative_prompts],
    latents=latent,
    image=disp,
    num_inference_steps=self.num_inference_steps,
    guidance_scale=self.guidance_scale,
    controlnet_conditioning_scale=self.controlnet_conditioning_scale,
    eta=self.eta,
    output_type='pt',
).images[0]
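A hedged sketch of one way to keep the fp16 speed instead of casting the whole pipeline up to float32: cast the inputs down to the pipeline's weight dtype so mat1 and mat2 agree (latent, disp, weight_dtype, and accelerator are the names from the snippet above):

# keep the pipeline in weight_dtype (e.g. torch.float16) and match the inputs to it
latent = latent.to(accelerator.device, dtype=weight_dtype)
disp = disp.to(accelerator.device, dtype=weight_dtype)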