Luke Merrick committed
Commit
1511436
1 Parent(s): 35df7db

Update README for int8 quantization tips

Files changed (1)
  1. README.md +2 -1
README.md CHANGED

```diff
@@ -7646,7 +7646,7 @@ Additionally, this model was designed to pair well with a corpus-independent sca
 | v1.5 | 256 | int8 | 256 (8.3%) | 54.2 (99%) | 3.9M (12x) |
 | v1.5 | 256 | int4 | 128 (4.2%) | 53.7 (98%) | 7.8M (24x) |
 
-NOTE: A good uniform scalar quantization range to use with this model (and which was used in the eval above) is -0.18 to +0.18. For a detailed walkthrough of int4 quantization with `snowflake-arctic-embed-m-v1.5`, check out our [example notebook on GitHub](https://github.com/Snowflake-Labs/arctic-embed/tree/main/compressed_embeddings_examples/score_arctic_embed_m_v1dot5_with_quantization.ipynb).
+NOTE: Good uniform scalar quantization ranges to use with this model (and which were used in the eval above) are -0.18 to +0.18 for 4-bit and -0.3 to +0.3 for 8-bit. For a detailed walkthrough of integer quantization with `snowflake-arctic-embed-m-v1.5`, check out our [example notebook on GitHub](https://github.com/Snowflake-Labs/arctic-embed/tree/main/compressed_embeddings_examples/score_arctic_embed_m_v1dot5_with_quantization.ipynb).
 
 ## Usage
 
@@ -7840,6 +7840,7 @@ console.log(similarities); // [0.15664823859882132, 0.24481869975470627]
 This model is designed to generate embeddings which compress well down to 128 bytes via a two-part compression scheme:
 1. Truncation and renormalization to 256 dimensions (a la Matryoshka Representation Learning, see [the original paper for reference](https://arxiv.org/abs/2205.13147)).
 2. 4-bit uniform scalar quantization of all 256 values to the same range (-0.18 to +0.18).
+   - For 8-bit uniform scalar quantization, the slightly wider range -0.3 to +0.3 tends to work slightly better given how much more granular 8-bit quantization is.
 
 For in-depth examples, check out our [arctic-embed GitHub repository](https://github.com/Snowflake-Labs/arctic-embed).
```
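The two-part compression scheme this commit documents can be sketched in a few lines of NumPy. This is an illustrative hand-rolled version, not the linked notebook's code; the function name and packing layout are my own, while the dimensions, bit widths, and quantization ranges come from the README text above:

```python
import numpy as np


def compress_embedding(embedding: np.ndarray, bits: int = 4) -> np.ndarray:
    """Sketch of the README's two-part compression scheme.

    1. Truncate to the first 256 dimensions and renormalize (MRL-style).
    2. Uniform scalar quantization of all 256 values to one shared range:
       -0.18..+0.18 for 4-bit, -0.3..+0.3 for 8-bit (per the commit).
    """
    # Part 1: truncation and renormalization to 256 dimensions.
    truncated = embedding[:256].astype(np.float64)
    truncated = truncated / np.linalg.norm(truncated)

    # Part 2: clip to the shared range, then map linearly onto the
    # available integer levels (15 for 4-bit, 255 for 8-bit).
    lo, hi = (-0.18, 0.18) if bits == 4 else (-0.3, 0.3)
    levels = 2**bits - 1
    clipped = np.clip(truncated, lo, hi)
    codes = np.round((clipped - lo) / (hi - lo) * levels).astype(np.uint8)

    if bits == 4:
        # Pack two 4-bit codes per byte: 256 dims -> 128 bytes total.
        codes = (codes[0::2] << 4) | codes[1::2]
    return codes
```

With `bits=4` the output is 128 bytes, matching the "compress well down to 128 bytes" claim; with `bits=8` it is 256 bytes, matching the int8 row of the table.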