Is there an FP8 version?
#1, opened by zhaoqi
The full version is too big. Is there an FP8 version?
Please help with code for using the FP8 and GGUF versions. The model cards only have code for the bf16 version.
https://huggingface.co/shuttleai/shuttle-3-diffusion-fp8
https://huggingface.co/shuttleai/shuttle-3-diffusion-GGUF
OSError: Error no file named model_index.json found in directory shuttleai/shuttle-3-diffusion-fp8.
Iunno, you probably should be using a backend like ComfyUI or Forge for that
If you are using diffusers, I recommend https://github.com/huggingface/optimum-quanto to quantize the bf16 weights after loading.
Example code:
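```python
import torch
from diffusers import DiffusionPipeline
from optimum.quanto import freeze, quantize, qint8, qfloat8, qint4

# Assumption: `pipe` is the bf16 pipeline from the model card; the repo id
# is inferred from the FP8/GGUF links above.
pipe = DiffusionPipeline.from_pretrained(
    "shuttleai/shuttle-3-diffusion", torch_dtype=torch.bfloat16
)

# Quantize the transformer weights in place; the excluded layers stay unquantized.
quantize(
    pipe.transformer,
    # weights=qfloat8,  # use this instead for FP8 weights
    weights=qint8,
    exclude=[
        "*.norm", "*.norm1", "*.norm2", "*.norm2_context",
        "proj_out", "x_embedder", "norm_out", "context_embedder",
    ],
)
# Replace the float weights with their quantized versions.
freeze(pipe.transformer)
# pipe.enable_model_cpu_offload()  # optional: offload idle components to reduce VRAM use
```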
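To see which layers the `exclude` list would skip, here is a rough stdlib-only sketch. It assumes quanto compares each module name against the patterns with `fnmatch`-style globbing, which is my understanding but worth checking against the library source. The module names and the `is_excluded` helper are made up for illustration; the real names come from `pipe.transformer.named_modules()`.

```python
from fnmatch import fnmatch

# Same patterns as in the quantize() call above.
EXCLUDE = [
    "*.norm", "*.norm1", "*.norm2", "*.norm2_context",
    "proj_out", "x_embedder", "norm_out", "context_embedder",
]


def is_excluded(module_name: str, patterns=EXCLUDE) -> bool:
    """Return True if the module name matches any exclude pattern (hypothetical helper)."""
    return any(fnmatch(module_name, pattern) for pattern in patterns)


# Illustrative module names only, not read from the actual model.
example_names = [
    "transformer_blocks.0.norm1",
    "transformer_blocks.0.attn.to_q",
    "single_transformer_blocks.3.norm",
    "proj_out",
    "x_embedder",
]

for name in example_names:
    status = "kept in original precision" if is_excluded(name) else "quantized"
    print(f"{name}: {status}")
```

Note that a pattern without a wildcard, such as `proj_out`, only matches a module with exactly that name, so a nested `something.proj_out` would still be quantized under this assumption.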