Quantized model too big?
#1 opened by D4ve-R
Hi, I noticed that the quantized model is 1.4 GB and the normal model is only 140 MB. Shouldn't it be the other way around?
The unquantized model is actually around 6.5 GB: the weights (decoder_model_merged.onnx_data) are stored separately due to protobuf limitations (a single protobuf file cannot exceed 2 GB, so ONNX keeps large weights in an external data file).
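You can verify this yourself by summing the on-disk sizes of the graph file and its external data file. A minimal sketch, assuming both files sit together in a local `onnx/` directory (the exact layout and graph filename are assumptions based on the data file's name):

```python
import os

# Hypothetical local paths; adjust to the actual repo layout.
model_dir = "onnx"
graph_file = os.path.join(model_dir, "decoder_model_merged.onnx")         # graph definition (small)
weights_file = os.path.join(model_dir, "decoder_model_merged.onnx_data")  # external weights (large)

# The true model size is the graph plus its external weights.
total_bytes = os.path.getsize(graph_file) + os.path.getsize(weights_file)
print(f"Unquantized model size: {total_bytes / 1e9:.1f} GB")
```

The quantized model folds everything into a single file under 2 GB, so no external data file is needed, which is why the 1.4 GB quantized model looks larger than the 140 MB graph file alone.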
Ok, got it! Sorry, seems like I need to do some research on how ONNX works.
D4ve-R changed discussion status to closed