	---
	license: apache-2.0
	datasets:
	- microsoft/ms_marco
	language:
	- en
	pipeline_tag: text-classification
	tags:
	- onnx
	- cross-encoder
	---

	# Cross-Encoder for MS Marco - ONNX

	ONNX versions of [Sentence Transformers Cross Encoders](https://huggingface.co/cross-encoder) to allow ranking without heavy dependencies.

	The models were trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.

	The models can be used for information retrieval: given a query, encode the query with each candidate passage (e.g. retrieved with ElasticSearch) to obtain a relevance score, then sort the passages by score in decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details.
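
The re-ranking step described above can be sketched without any model at all. The snippet below is a minimal illustration, not part of this repository: `rerank` and `toy_overlap_score` are hypothetical names, and the word-overlap scorer is only a stand-in for a real cross-encoder score.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    passages: Sequence[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return the top_k, best first."""
    scored = [(passage, float(score_fn(query, passage))) for passage in passages]
    # Higher score means more relevant, so sort in decreasing order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query: str, passage: str) -> float:
    """Hypothetical stand-in scorer: counts words shared by query and passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return float(len(query_words & passage_words))
```

In practice `score_fn` would wrap a call to one of the ONNX models, and `passages` would be the candidates returned by a first-stage retriever.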

	## Models Available

	| Model Name                              | Precision | File Name                            | File Size |
	|-----------------------------------------|-----------|--------------------------------------|-----------|
	| ms-marco-MiniLM-L-4-v2 ONNX             | FP32      | ms-marco-MiniLM-L-4-v2-onnx.zip      | 70 MB     |
	| ms-marco-MiniLM-L-4-v2 ONNX (Quantized) | INT8      | ms-marco-MiniLM-L-4-v2-onnx-int8.zip | 12.8 MB   |
	| ms-marco-MiniLM-L-6-v2 ONNX             | FP32      | ms-marco-MiniLM-L-6-v2-onnx.zip      | 83.4 MB   |
	| ms-marco-MiniLM-L-6-v2 ONNX (Quantized) | INT8      | ms-marco-MiniLM-L-6-v2-onnx-int8.zip | 15.2 MB   |
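
Since the models ship as zip archives, they need to be unpacked before use. The helper below is a rough sketch using only the standard library; `extract_model_zip` is a hypothetical name, and the internal layout of the archives is not documented here, so the function searches for `.onnx` files instead of assuming a fixed path.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_model_zip(zip_path: str, target_dir: str) -> List[Path]:
    """Unpack a model archive and return the .onnx files found inside it."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Search recursively, since the archive's internal layout is not assumed.
    return sorted(target.rglob("*.onnx"))
```

The archive itself could be fetched beforehand with `hf_hub_download(repo_id=..., filename="ms-marco-MiniLM-L-4-v2-onnx.zip")` from `huggingface_hub`; the repository id is left open here and should be taken from this model page.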

	## Usage with ONNX Runtime

	```python
	import onnxruntime as ort
	from transformers import AutoTokenizer

	# Assumed to be the directory extracted from ms-marco-MiniLM-L-4-v2-onnx.zip
	model_path = "ms-marco-MiniLM-L-4-v2-onnx/"
	tokenizer = AutoTokenizer.from_pretrained(model_path)  # pass the variable, not the string 'model_path'
	ort_sess = ort.InferenceSession(model_path + "ms-marco-MiniLM-L-4-v2.onnx")

	# Each query is paired with the passage at the same index
	queries = ['How many people live in Berlin?', 'How many people live in Berlin?']
	passages = ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.', 'New York City is famous for the Metropolitan Museum of Art.']
	features = tokenizer(queries, passages, padding=True, truncation=True, return_tensors="np")

	# ONNX Runtime expects a plain dict mapping input names to NumPy arrays
	ort_outs = ort_sess.run(None, dict(features))
	print(ort_outs)
	```
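
The raw session output is a list of arrays, not a ranking. The sketch below shows one way to turn it into per-pair scores. It assumes the first output holds the logits with shape `(batch, 1)`, which should be checked against the actual model; `logits_to_scores` is a hypothetical helper, and the numbers are made up for illustration.

```python
import numpy as np


def logits_to_scores(ort_outs):
    """Turn raw session outputs into one relevance score per input pair."""
    # Assumption: the first output holds the logits, shaped (batch, 1).
    logits = np.asarray(ort_outs[0], dtype=np.float32).reshape(-1)
    # Sigmoid is monotonic: it rescales to (0, 1) without changing the ranking.
    return 1.0 / (1.0 + np.exp(-logits))


# Illustrative values only, not real model output.
fake_outs = [np.array([[8.0], [-4.0]], dtype=np.float32)]
scores = logits_to_scores(fake_outs)
ranking = np.argsort(-scores)  # indices of the pairs, most relevant first
```

Applying the sigmoid is optional when only the order matters, since sorting the raw logits gives the same ranking.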

	## Performance

	To be updated.