zeroshot committed on
Commit
de3c702
1 Parent(s): d5ec521

Update README.md

  1. README.md +2 -0
README.md CHANGED
@@ -1625,6 +1625,8 @@ language:
   <img src="https://huggingface.co/zeroshot/bge-small-en-v1.5-quant/resolve/main/latency.png" alt="latency" width="600" style="display:inline-block; margin-right:10px;"/>
   </div>
 
+ DeepSparse improves latency performance by 3X on 10 core laptop and 5X on an AWS instance.
+
   ## Usage
 
   This is the quantized (INT8) ONNX variant of the [bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) embeddings model accelerated with [Sparsify](https://github.com/neuralmagic/sparsify) for quantization and [DeepSparseSentenceTransformers](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/sentence_transformers) for inference.