Updates to minimal quantization script.
#1, opened by nickfraser
This PR makes the following updates to the minimal quantization script:
- Reorganized files (moved into a subdirectory)
- Added an optional calibration set size (a small set is useful for detecting algorithmic changes)
- Added validation with an optional set size (a small set is useful for detecting algorithmic changes)
- Fix: added quantization to the output of QuantLoraLinear layers (when --quantize-sdp is enabled)
- Updated dependencies, added conda environment files
- Added pre-generated captions (from Nvidia's previous submission)
- Added basic README for reproducing results
- Checked that the right quantized structure is generated for Int8+FP8 mode
- Checked that the right quantized structure is generated for Int8 mode
- Checked that the compliant accuracy is achieved for Int8+FP8
- Checked that the compliant accuracy is achieved for Int8
- Checked that dataset setup instructions (from MLPerf) work & produce the correct directories / files
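
The optional calibration and validation sizes listed above could be wired up roughly as follows. This is a minimal sketch, not the script's actual code: the flag names `--calibration-size` and `--validation-size`, the helper `take_subset`, and the placeholder caption list are all hypothetical and chosen only for illustration.

```python
import argparse
from itertools import islice
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def take_subset(items: Iterable[T], size: Optional[int]) -> List[T]:
    """Return the first `size` items, or every item when `size` is None."""
    if size is not None and size < 0:
        raise ValueError("size must be non-negative")
    # islice(items, None) iterates over the whole input.
    return list(islice(items, size))


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flag names; the real script may use different ones.
    parser = argparse.ArgumentParser(description="Subset-size options (sketch)")
    parser.add_argument("--calibration-size", type=int, default=None,
                        help="number of calibration samples (default: all)")
    parser.add_argument("--validation-size", type=int, default=None,
                        help="number of validation samples (default: all)")
    return parser


# Example: a small calibration set, full validation set.
args = build_parser().parse_args(["--calibration-size", "8"])
captions = [f"caption {i}" for i in range(100)]  # placeholder data
calibration = take_subset(captions, args.calibration_size)
validation = take_subset(captions, args.validation_size)
print(len(calibration), len(validation))  # prints "8 100"
```

Defaulting both sizes to `None` keeps the full-dataset behaviour unless a size is given, so a short run for spotting algorithmic changes is opt-in.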
GiusFra changed pull request status to open
GiusFra changed pull request status to merged