
Still not small enough? Try https://huggingface.co/imnotednamode/mochi-1-preview-mix-nf4-small

This mixes Mochi with a development version of diffusers to achieve high-quality, fast inference with the full 161 frames on a single 24 GB card. This repo contains only the transformer. After installing the diffusers Mochi development branch with `pip install git+https://github.com/huggingface/diffusers@mochi`, the transformer can be loaded normally and used in a pipeline like so:

```python
import torch
from diffusers import MochiPipeline, MochiTransformer3DModel
from diffusers.utils import export_to_video

transformer = MochiTransformer3DModel.from_pretrained(
    "imnotednamode/mochi-1-preview-mix-nf4", torch_dtype=torch.bfloat16
)
pipe = MochiPipeline.from_pretrained(
    "mochi-1-diffusers", torch_dtype=torch.bfloat16, transformer=transformer
)
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()
frames = pipe(
    "A camera follows a squirrel running around on a tree branch",
    num_inference_steps=100,
    guidance_scale=4.5,
    height=480,
    width=848,
    num_frames=161,
).frames[0]
export_to_video(frames, "mochi.mp4", fps=15)
```

In the above, `"mochi-1-diffusers"` refers to a local copy of https://huggingface.co/genmo/mochi-1-preview that you must first convert to the diffusers format using the convert_mochi_to_diffuser.py script from https://github.com/huggingface/diffusers/pull/9769.

I've noticed that raising guidance_scale lets the model produce coherent output with fewer steps, but it also reduces motion, since the model tries to align mostly with the text prompt. To an extent, this can also mitigate the quality degradation from using full nf4 weights.

This version works by mixing nf4 and bf16 weights. Pure nf4 weights degrade model quality significantly, while pure bf16 or LLM.int8 weights mean the full 161 frames can't fit into VRAM. This version strikes a balance (everything except a few blocks is in bf16).
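To see why a bf16/nf4 mix trades quality for memory, here is a back-of-envelope VRAM estimate for the transformer weights. This is a sketch, not a measurement: the 10B parameter count and the 50% mix fraction are illustrative assumptions, and per-block quantization constants (a small extra overhead in real nf4 storage) are ignored.

```python
# Rough weight-memory estimate for a transformer stored as a mix of
# bf16 (2 bytes/param) and nf4 (4 bits = 0.5 byte/param) weights.
# Quantization-constant overhead is deliberately ignored for simplicity.

def estimate_gib(num_params: float, nf4_fraction: float) -> float:
    """Weight memory in GiB for a model where `nf4_fraction` of the
    parameters are stored in nf4 and the rest in bf16."""
    bf16_bytes = num_params * (1 - nf4_fraction) * 2.0
    nf4_bytes = num_params * nf4_fraction * 0.5
    return (bf16_bytes + nf4_bytes) / 2**30

params = 10e9  # assumed parameter count, for illustration only
print(f"pure bf16: {estimate_gib(params, 0.0):.1f} GiB")
print(f"pure nf4:  {estimate_gib(params, 1.0):.1f} GiB")
print(f"50/50 mix: {estimate_gib(params, 0.5):.1f} GiB")
```

The mixed layout lands between the two extremes, which is what lets activations plus most-precision weights still fit alongside 161 frames of latents on a 24 GB card.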

Here's a comparison of sample outputs at each precision: bf16, nf4mix (this one), LLM.int8, nf4, and fp4.
