RuntimeError: cutlassF: no kernel found to launch!
When running the example code on Google Colab with a T4 GPU, it throws RuntimeError: cutlassF: no kernel found to launch! during the prior inference step.
Try adding these two lines:
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_flash_sdp(False)
Adding those 2 lines fixed it for me.
I opened a PR a few minutes ago...
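For context, a minimal sketch of where those two calls go in the prior step, assuming the diffusers StableCascadePriorPipeline example from the model card (the dtype, prompt, and parameter values below are placeholders of mine, not taken from the PR):

import torch
from diffusers import StableCascadePriorPipeline

# Disable the SDP backends that trigger "cutlassF: no kernel found to launch" on T4.
# These calls must run before the first attention op, so place them right after the imports.
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_flash_sdp(False)

prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior",
    torch_dtype=torch.bfloat16,  # dtype from the model-card example; may need adjusting on T4
).to("cuda")

prior_output = prior(
    prompt="an astronaut riding a horse",  # placeholder prompt
    negative_prompt="",
    guidance_scale=4.0,
    num_inference_steps=20,
)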
Those two lines work, but now I am running out of RAM before the decoder step.
Try reducing num_images_per_prompt to 1.
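In the sketch above that means changing just the prior call (the other arguments stay as the placeholders from before):

prior_output = prior(
    prompt="an astronaut riding a horse",  # placeholder prompt
    negative_prompt="",
    guidance_scale=4.0,
    num_images_per_prompt=1,  # generate a single image per prompt to cut peak memory
    num_inference_steps=20,
)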
Nope, still not enough RAM for this, I guess.
I think the answer is here: https://huggingface.co/stabilityai/stable-cascade/discussions/3
It works in a T4 colab for me when I install accelerate.
Example Colab here: https://colab.research.google.com/drive/1qV14_OzZDNx6G-Lx2NE2Imk_7dfDbwkm?usp=sharing
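For reference, a rough sketch of that setup; the decoder dtype, the .half() cast, and the call arguments below are my assumptions from the public model-card example, not copied from the linked notebook:

# Requires accelerate: pip install accelerate
import torch
from diffusers import StableCascadeDecoderPipeline

decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", torch_dtype=torch.float16
).to("cuda")

images = decoder(
    image_embeddings=prior_output.image_embeddings.half(),  # embeddings from the prior step above
    prompt="an astronaut riding a horse",  # same placeholder prompt as the prior
    negative_prompt="",
    guidance_scale=0.0,
    output_type="pil",
    num_inference_steps=10,
).images
images[0].save("output.png")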
T4 doesn't work for me:
OutOfMemoryError: CUDA out of memory. Tried to allocate 40.00 MiB.
Where are those two torch.backends lines supposed to go? :P
Where do I add these two lines?
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_flash_sdp(False)
Please have a look at the example Colab referenced above.