requirements? #7
by vladmandic - opened
what are the requirements for this model?
- running I2VGenXLPipeline in FP16 produces NaNs (a quick check is sketched below)
- running in FP32 works, but even at 640x352 (which is 1/4 of the model's native resolution) and decode_chunk_size=1 it pegs the GPU at ~24 GB of used VRAM
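For reference, a minimal way to check the decoded frames for NaNs could look like this. This is only a sketch, not something I have benchmarked; it assumes output_type="np" returns numpy frames, as in other diffusers video pipelines.

import numpy as np
import torch
from diffusers.utils import load_image
from diffusers import I2VGenXLPipeline

# Sketch: run the FP16 pipeline and inspect the raw frame array for NaNs.
pipe = I2VGenXLPipeline.from_pretrained("ali-vilab/i2vgen-xl", torch_dtype=torch.float16, variant="fp16")
pipe.enable_model_cpu_offload()
image = load_image("https://github.com/ali-vilab/i2vgen-xl/blob/main/data/test_images/img_0009.png?raw=true").convert("RGB")
frames = pipe(
    prompt="Papers were floating in the air on a table in the library",
    image=image,
    generator=torch.manual_seed(8888),
    output_type="np",  # assumption: numpy frames, so np.isnan can be applied directly
).frames[0]
print("any NaNs:", np.isnan(frames).any())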
running I2VGenXLPipeline in FP16 produces NaNs
Not sure if that's the case. More details below.
running in FP32 works, but even at 640x352 (which is 1/4 of the model's native resolution) and decode_chunk_size=1 it pegs the GPU at ~24 GB of used VRAM
Again, not sure if that's the case. More details below.
My script:
import torch
from diffusers.utils import load_image, export_to_gif
from diffusers import I2VGenXLPipeline

def bytes_to_giga_bytes(bytes):
    return f"{(bytes / 1024 / 1024 / 1024):.3f}"

# Load the FP16 variant and offload submodules to the CPU when they are idle.
pipeline = I2VGenXLPipeline.from_pretrained("ali-vilab/i2vgen-xl", torch_dtype=torch.float16, variant="fp16")
pipeline.enable_model_cpu_offload()

image_url = "https://github.com/ali-vilab/i2vgen-xl/blob/main/data/test_images/img_0009.png?raw=true"
image = load_image(image_url).convert("RGB")

prompt = "Papers were floating in the air on a table in the library"
negative_prompt = "Distorted, discontinuous, Ugly, blurry, low resolution, motionless, static, disfigured, disconnected limbs, Ugly faces, incomplete arms"
generator = torch.manual_seed(8888)

frames = pipeline(
    prompt=prompt,
    image=image,
    negative_prompt=negative_prompt,
    generator=generator,
).frames[0]
video_path = export_to_gif(frames, "i2v.gif")

# Report peak GPU memory allocated during the run.
memory = bytes_to_giga_bytes(torch.cuda.max_memory_allocated())
print(f"Memory: {memory}GB")
Prints 11.714GB on a 4090.
And the following is the output: https://huggingface.co/datasets/sayakpaul/sample-datasets/blob/main/i2v.gif
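For the FP32 memory question, a lower-memory variant could combine the CPU offload used above with the decode_chunk_size=1 setting from the original report, so the VAE decodes one frame at a time. This is just a sketch; I haven't benchmarked it, so the actual peak VRAM at higher resolutions would still need to be measured.

import torch
from diffusers.utils import load_image, export_to_gif
from diffusers import I2VGenXLPipeline

# Sketch: FP32 run with CPU offload and frame-by-frame VAE decoding.
pipeline = I2VGenXLPipeline.from_pretrained("ali-vilab/i2vgen-xl", torch_dtype=torch.float32)
pipeline.enable_model_cpu_offload()

image = load_image("https://github.com/ali-vilab/i2vgen-xl/blob/main/data/test_images/img_0009.png?raw=true").convert("RGB")
frames = pipeline(
    prompt="Papers were floating in the air on a table in the library",
    image=image,
    decode_chunk_size=1,  # decode one frame at a time to limit peak VAE memory
    generator=torch.manual_seed(8888),
).frames[0]
export_to_gif(frames, "i2v_fp32.gif")
print(f"Memory: {torch.cuda.max_memory_allocated() / 1024**3:.3f}GB")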