---
license: apache-2.0
language:
- en
library_name: transformers
---
|
# PubMedBERT Embeddings Matryoshka - ONNX - O4
|
|
|
O4 optimized weights of [`NeuML/pubmedbert-base-embeddings-matryoshka`](https://huggingface.co/NeuML/pubmedbert-base-embeddings-matryoshka).
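
If you want to reproduce this kind of export yourself, a similar O4-optimized model can likely be generated with 🤗 Optimum's `ORTOptimizer`. The sketch below is an assumption about how these weights were produced, not the exact export script; note that the O4 level applies fp16 and GPU-only graph optimizations, so it requires a CUDA device.

```python
from optimum.onnxruntime import ORTModelForFeatureExtraction, ORTOptimizer, AutoOptimizationConfig

# Export the base PyTorch model to ONNX
model = ORTModelForFeatureExtraction.from_pretrained(
    "NeuML/pubmedbert-base-embeddings-matryoshka", export=True
)

# Apply O4 graph optimizations (includes fp16 conversion, GPU-only)
optimizer = ORTOptimizer.from_pretrained(model)
optimizer.optimize(
    save_dir="pubmedbert-base-embeddings-matryoshka-onnx-o4",  # hypothetical output path
    optimization_config=AutoOptimizationConfig.O4(),
)
```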
|
|
|
## Usage
|
|
|
```python
from optimum.onnxruntime import ORTModelForFeatureExtraction
from transformers import AutoTokenizer
import torch

# Mean Pooling - Take attention mask into account for correct averaging
def meanpooling(output, mask):
    embeddings = output[0]  # First element of model output contains all token embeddings
    mask = mask.unsqueeze(-1).expand(embeddings.size()).float()
    return torch.sum(embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']

model = ORTModelForFeatureExtraction.from_pretrained(
    "hooman650/pubmedbert-base-embeddings-matryoshka-onnx-04",
    provider="CUDAExecutionProvider"
)
tokenizer = AutoTokenizer.from_pretrained("hooman650/pubmedbert-base-embeddings-matryoshka-onnx-04")

# Tokenize sentences and move inputs to the GPU
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt').to("cuda")

# Compute token embeddings
with torch.no_grad():
    output = model(**inputs)

# Perform pooling. In this case, mean pooling.
embeddings = meanpooling(output, inputs['attention_mask'])

# Requested matryoshka dimensions
dimensions = 256

print("Sentence embeddings:")
print(embeddings[:, :dimensions])
```
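
For similarity search, truncated matryoshka embeddings are typically L2-normalized before computing cosine similarity. Here is a minimal sketch continuing from the variables above; the normalization step is a common convention for embedding models, not something specified in this card.

```python
import torch.nn.functional as F

# Truncate to the requested matryoshka dimensions and L2-normalize
truncated = F.normalize(embeddings[:, :dimensions], p=2, dim=1)

# Cosine similarity between the two example sentences
similarity = truncated[0] @ truncated[1]
print(f"Cosine similarity: {similarity.item():.4f}")
```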