Quantized model too big?
#1 opened by D4ve-R
Hi, I noticed that the quantized model is 1.4 GB and the normal model is only 140 MB. Shouldn't it be the other way around?
The unquantized model is actually around 6.5 GB: the weights (decoder_model_merged.onnx_data) are stored separately due to protobuf limitations (a single protobuf file cannot exceed 2 GB, so ONNX keeps large weights in an external data file).
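You can verify this yourself by summing the on-disk sizes of the graph file and its external data file. A minimal sketch, assuming both files sit together in a local `onnx/` directory (the exact layout and graph filename are assumptions based on the data file's name):

```python
import os

# Hypothetical local paths; adjust to the actual repo layout.
model_dir = "onnx"
graph_file = os.path.join(model_dir, "decoder_model_merged.onnx")         # graph definition (small)
weights_file = os.path.join(model_dir, "decoder_model_merged.onnx_data")  # external weights (large)

# The true model size is the graph plus its external weights.
total_bytes = os.path.getsize(graph_file) + os.path.getsize(weights_file)
print(f"Unquantized model size: {total_bytes / 1e9:.1f} GB")
```

The quantized model folds everything into a single file under 2 GB, so no external data file is needed, which is why the 1.4 GB quantized model looks larger than the 140 MB graph file alone.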
Ok, got it! Sorry, seems like I need to do some research on how ONNX works.
D4ve-R changed discussion status to closed