Transformers documentation

Optimum

The Optimum library supports quantization through its Intel, Furiosa, ONNX Runtime, and GPTQ integrations, as well as lower-level PyTorch quantization functions. Consider using Optimum for quantization if you're targeting specific, optimized hardware such as Intel CPUs or Furiosa NPUs, or a model accelerator such as ONNX Runtime.
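As an illustration of the ONNX Runtime path, the sketch below exports a model to ONNX and applies dynamic int8 quantization with `ORTQuantizer`. It is a minimal example under stated assumptions, not the only way to use Optimum: it assumes `optimum` is installed with its ONNX Runtime extra, that the target is a CPU with AVX512-VNNI support, and that the task is sequence classification. The function name `quantize_dynamic_int8`, the checkpoint, and the output directory are chosen for illustration only, and exact arguments may differ between Optimum versions.

```python
def quantize_dynamic_int8(model_id: str, save_dir: str):
    """Export `model_id` to ONNX and save a dynamically quantized copy to `save_dir`."""
    # Imported lazily so that defining this helper does not require Optimum to be installed.
    from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
    from optimum.onnxruntime.configuration import AutoQuantizationConfig

    # Load the checkpoint and convert it to ONNX on the fly.
    ort_model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)

    # Build a quantizer for the exported model.
    quantizer = ORTQuantizer.from_pretrained(ort_model)

    # Dynamic quantization needs no calibration dataset.
    # Assumption: the target CPU supports AVX512-VNNI.
    qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)

    # Write the quantized model to save_dir.
    return quantizer.quantize(save_dir=save_dir, quantization_config=qconfig)


if __name__ == "__main__":
    # Assumed example checkpoint and output path; substitute your own.
    quantize_dynamic_int8(
        "distilbert-base-uncased-finetuned-sst-2-english",
        "quantized_model",
    )
```

Dynamic quantization is used here because it computes activation ranges at inference time and therefore needs no calibration data. Static quantization, or the Intel, Furiosa, and GPTQ integrations, follow different workflows described in the Optimum documentation.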
