shariqmobin committed
Commit
d6f7c27
1 Parent(s): e6a4ff7

Update README.md


According to the LLM Compressor docs, the `save_compressed=True` flag should be passed to `save_pretrained`, as shown in this example from the project: https://github.com/vllm-project/llm-compressor/tree/main/examples/quantization_w8a8_int8

I believe https://huggingface.co/neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w8a8 has the same issue.

Files changed (1): README.md (+1, −1)
README.md CHANGED
@@ -126,7 +126,7 @@ oneshot(
     num_calibration_samples=num_samples,
 )
 
-model.save_pretrained("Meta-Llama-3.1-8B-Instruct-quantized.w8a8")
+model.save_pretrained("Meta-Llama-3.1-8B-Instruct-quantized.w8a8", save_compressed=True)
 ```
 