Update README.md
add note about weights compression parameters
README.md
CHANGED
@@ -10,6 +10,16 @@ language:
## Description
This is the [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) model converted to the [OpenVINO](https://docs.openvino.ai/2024/home.html) Intermediate Representation (IR) format, with weights compressed to INT4 using [NNCF](https://github.com/openvinotoolkit/nncf).
+## Quantization Configuration
+
+Model weights were compressed to INT4 precision using `nncf.compress_weights` with the following parameters:
+
+* mode: **INT4_SYM**
+* group_size: **128**
+* ratio: **0.8**
+
+More details about the optimization parameters can be found in the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html).
+
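To illustrate how the parameters listed in the added section map onto the NNCF API, here is a minimal sketch. It is not the script used to produce this IR, which the commit does not show. The helper name `compress_to_int4` and the `COMPRESSION_PARAMS` table are hypothetical; `nncf.compress_weights` and `nncf.CompressWeightsMode` are the NNCF entry points the README refers to.

```python
# Values quoted from the "Quantization Configuration" list in the README.
COMPRESSION_PARAMS = {"mode": "INT4_SYM", "group_size": 128, "ratio": 0.8}


def compress_to_int4(ov_model, params=COMPRESSION_PARAMS):
    """Apply NNCF weight compression to an OpenVINO model (hypothetical helper)."""
    # Imported lazily so the parameter table can be inspected without NNCF installed.
    import nncf

    # Map the mode name from the README onto the matching NNCF enum member.
    mode = getattr(nncf.CompressWeightsMode, params["mode"])
    return nncf.compress_weights(
        ov_model,
        mode=mode,
        group_size=params["group_size"],
        ratio=params["ratio"],
    )
```

The `ov_model` argument is assumed to be an already loaded OpenVINO model object; how the original model was exported before compression is not stated in the commit.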
## Compatibility
The provided IR is compatible with OpenVINO starting from version 2024.0.0 and optimum-intel 1.16.0.
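The compatibility note can be checked against a local environment with a short script. The sketch below is an illustration only: it assumes both version numbers are minimums (the README is explicit about this only for OpenVINO) and that the packages are installed under the distribution names `openvino` and `optimum-intel`. The helper names are hypothetical.

```python
import re
from importlib import metadata

# Versions quoted from the Compatibility section; treating both as minimums is an assumption.
MINIMUM_VERSIONS = {"openvino": "2024.0.0", "optimum-intel": "1.16.0"}


def release_tuple(version):
    """Return the leading numeric release, e.g. '2024.0.0-14509-abc' -> (2024, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release found in {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions such as '2024.0' so tuples compare component by component.
    return parts + (0,) * (3 - len(parts))


def meets_minimum(installed, minimum):
    """True when the installed release is at least the minimum release."""
    return release_tuple(installed) >= release_tuple(minimum)


def check_environment():
    """Map each package name to True/False; missing packages count as False."""
    results = {}
    for package, minimum in MINIMUM_VERSIONS.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            results[package] = False
            continue
        results[package] = meets_minimum(installed, minimum)
    return results


if __name__ == "__main__":
    for package, ok in check_environment().items():
        print(f"{package}: {'ok' if ok else 'missing or too old'}")
```

Only the leading numeric part of each version string is compared, so build suffixes and pre-release tags are ignored.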