---
license: apache-2.0
datasets:
- microsoft/ms_marco
language:
- en
pipeline_tag: text-classification
tags:
- onnx
- cross-encoder
---

# Cross-Encoder for MS Marco - ONNX

ONNX versions of [Sentence Transformers Cross Encoders](https://huggingface.co/cross-encoder) to allow ranking without heavy dependencies.

The models were trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.

The models can be used for Information Retrieval: given a query, score the query paired with each candidate passage (e.g. passages retrieved with Elasticsearch), then sort the passages by score in decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details.

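
The re-ranking step described above can be sketched as follows. This is a minimal illustration, not the models' own API: `rerank` and `toy_overlap_score` are hypothetical names, and the toy scorer (a simple word-overlap count) only stands in for a cross-encoder so that the example runs without any model files.

```python
import re
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    passages: Sequence[str],
    score_fn: Callable[[str, Sequence[str]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return passages sorted by descending score."""
    scores = score_fn(query, passages)
    ranked = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked]


def toy_overlap_score(query: str, passages: Sequence[str]) -> List[float]:
    """Hypothetical stand-in scorer: number of distinct words shared with the query."""
    query_words = set(re.findall(r"\w+", query.lower()))
    return [
        float(len(query_words & set(re.findall(r"\w+", passage.lower()))))
        for passage in passages
    ]


candidates = [
    "New York City is famous for the Metropolitan Museum of Art.",
    "Berlin has a population of 3,520,031 registered inhabitants.",
]
for passage, score in rerank("How many people live in Berlin?", candidates, toy_overlap_score):
    print(f"{score:.1f}  {passage}")
```

In practice `score_fn` would wrap the tokenizer and ONNX session, returning one relevance score per passage.
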

## Models Available

| Model Name                              | Precision | File Name                            | File Size |
|-----------------------------------------|-----------|--------------------------------------|-----------|
| ms-marco-MiniLM-L-4-v2 ONNX             | FP32      | ms-marco-MiniLM-L-4-v2-onnx.zip      | 70 MB     |
| ms-marco-MiniLM-L-4-v2 ONNX (Quantized) | INT8      | ms-marco-MiniLM-L-4-v2-onnx-int8.zip | 12.8 MB   |
| ms-marco-MiniLM-L-6-v2 ONNX             | FP32      | ms-marco-MiniLM-L-6-v2-onnx.zip      | 83.4 MB   |
| ms-marco-MiniLM-L-6-v2 ONNX (Quantized) | INT8      | ms-marco-MiniLM-L-6-v2-onnx-int8.zip | 15.2 MB   |

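
Since the models ship as `.zip` archives, they need to be unpacked before use. The sketch below uses only the Python standard library; `extract_model` is a hypothetical helper, and the internal layout of the archives is an assumption, so check the returned entry names to find the model and tokenizer files.

```python
import zipfile
from pathlib import Path


def extract_model(zip_path, dest_dir="."):
    """Unpack a downloaded model archive and return the names of its entries."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example call, assuming the archive has already been downloaded locally:
# print(extract_model("ms-marco-MiniLM-L-4-v2-onnx.zip"))
```
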

## Usage with ONNX Runtime

```python
import onnxruntime as ort
from transformers import AutoTokenizer

# Local directory holding the tokenizer files and the ONNX model
model_path = "ms-marco-MiniLM-L-4-v2-onnx/"
tokenizer = AutoTokenizer.from_pretrained(model_path)
ort_sess = ort.InferenceSession(model_path + "ms-marco-MiniLM-L-4-v2.onnx")

# Pair the query with each candidate passage
query = "How many people live in Berlin?"
passages = [
    "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
    "New York City is famous for the Metropolitan Museum of Art.",
]
features = tokenizer([query] * len(passages), passages, padding=True, truncation=True, return_tensors="np")

# Pass the tokenizer output to ONNX Runtime as a plain dict of NumPy arrays
ort_outs = ort_sess.run(None, dict(features))
print(ort_outs)
```

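
The raw session output is a list of arrays, so a small post-processing step is usually needed before ranking. The sketch below assumes the first output holds one logit per (query, passage) pair, for example with shape `(batch, 1)`; this has not been verified against the exported models. `outputs_to_scores` is a hypothetical helper, and the numbers in `example_outs` are made up for illustration, not real model results.

```python
import numpy as np


def outputs_to_scores(ort_outs, apply_sigmoid=False):
    """Flatten the first output array into one score per (query, passage) pair."""
    logits = np.asarray(ort_outs[0], dtype=np.float64).reshape(-1)
    if apply_sigmoid:
        # Map logits into (0, 1); the ordering of the scores is unchanged.
        return 1.0 / (1.0 + np.exp(-logits))
    return logits


# Made-up values standing in for the result of ort_sess.run(...)
example_outs = [np.array([[8.2], [-4.1]], dtype=np.float32)]
scores = outputs_to_scores(example_outs, apply_sigmoid=True)
order = np.argsort(-scores)  # passage indices, highest score first
print(scores, order)
```

The sigmoid is optional: sorting by raw logits gives the same ranking, and squashing only helps when scores should be read on a 0 to 1 scale.
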

## Performance

TBU...