# Universal Sentence Encoder Multilingual v3
ONNX version of https://tfhub.dev/google/universal-sentence-encoder-multilingual/3

The original TFHub version of the model is referenced by other models on the Hugging Face Hub, e.g. https://huggingface.co/vprelovac/universal-sentence-encoder-multilingual-3
## Overview
See the overview and license details at https://tfhub.dev/google/universal-sentence-encoder-multilingual/3

This model is a full-precision version of the TFHub original, converted to ONNX format. It uses ONNX Runtime Extensions to embed the tokenizer within the ONNX model, so no separate tokenizer is needed and raw text is fed directly into the model. Post-processing (e.g. pooling and normalization) is also implemented within the ONNX graph, so no separate processing step is necessary (see the usage example below).
## How to use
```python
import onnxruntime as ort
from onnxruntime_extensions import get_library_path
from os import cpu_count

sentences = ["hello world"]

def load_onnx_model(model_filepath):
    _options = ort.SessionOptions()
    # Use all available cores for both inter-op and intra-op parallelism
    _options.inter_op_num_threads, _options.intra_op_num_threads = cpu_count(), cpu_count()
    # Register the ONNX Runtime Extensions custom ops, including the embedded tokenizer
    _options.register_custom_ops_library(get_library_path())
    _providers = ["CPUExecutionProvider"]  # could use ort.get_available_providers()
    return ort.InferenceSession(path_or_bytes=model_filepath, sess_options=_options, providers=_providers)

model = load_onnx_model("filepath_for_model_dot_onnx")

# Raw strings are fed directly to the model; tokenization happens inside the graph
model_outputs = model.run(output_names=["outputs"], input_feed={"inputs": sentences})[0]
print(model_outputs)
```
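Because pooling and normalization are part of the graph, each row of `model_outputs` is already a unit-length sentence embedding, so cosine similarity reduces to a dot product. Below is a minimal sketch of cross-lingual similarity, reusing the `model` session loaded above; the example sentences are illustrative, and the assumption that outputs are L2-normalized follows the description above.

```python
import numpy as np

# Illustrative sentences: an English/Spanish paraphrase pair plus an unrelated sentence
sentences = ["How old are you?", "¿Cuántos años tienes?", "The weather is nice today."]

# Reuse the session loaded above; raw text in, pooled embeddings out
embeddings = np.asarray(
    model.run(output_names=["outputs"], input_feed={"inputs": sentences})[0]
)

# Since the model normalizes its outputs (see above), a dot product is cosine similarity
similarity = embeddings @ embeddings.T
print(similarity.round(3))  # the paraphrase pair should score highest off the diagonal
```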
Inference API (serverless) has been turned off for this model.