---
library_name: transformers
base_model: "cross-encoder/ms-marco-MiniLM-L-12-v2"
model-index:
- name: esci-ms-marco-MiniLM-L-12-v2
  results:
  - task:
      type: Reranking
    metrics:
    - type: mrr@10
      value: 91.74
    - type: ndcg@10
      value: 84.83
tags: ["cross-encoder", "search", "product-search"]
---
# Model Description
<!-- Provide a quick summary of what the model is/does. -->
A cross-encoder fine-tuned from `cross-encoder/ms-marco-MiniLM-L-12-v2` on the Amazon ESCI product search dataset.
# Usage
## Transformers
<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "lv12/esci-ms-marco-MiniLM-L-12-v2"

queries = [
    "adidas shoes",
    "adidas sambas",
    "girls sandals",
    "backpacks",
    "shoes",
    "mustard blouse",
]
documents = [
    "Nike Air Max, with air cushion",
    "Adidas Ultraboost, the best boost you can get",
    "Women's sandals wide width 9",
    "Girl's surf backpack",
    "Fresh watermelon, all you can eat",
    "Floral yellow dress with frills and lace",
]

model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Tokenize each (query, document) pair jointly, as the cross-encoder expects.
inputs = tokenizer(
    queries,
    documents,
    padding=True,
    truncation=True,
    return_tensors="pt",
)

# One relevance logit per (query, document) pair.
model.eval()
with torch.no_grad():
    scores = model(**inputs).logits.cpu().detach().numpy()
print(scores)
```
## Sentence Transformers
```python
from sentence_transformers import CrossEncoder

model_name = "lv12/esci-ms-marco-MiniLM-L-12-v2"

queries = [
    "adidas shoes",
    "adidas sambas",
    "girls sandals",
    "backpacks",
    "shoes",
    "mustard blouse",
]
documents = [
    "Nike Air Max, with air cushion",
    "Adidas Ultraboost, the best boost you can get",
    "Women's sandals wide width 9",
    "Girl's surf backpack",
    "Fresh watermelon, all you can eat",
    "Floral yellow dress with frills and lace",
]

# max_length caps the tokenized (query, document) pair at 512 tokens.
model = CrossEncoder(model_name, max_length=512)
scores = model.predict([(q, d) for q, d in zip(queries, documents)])
print(scores)
```
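When reranking many candidate documents for a single query, recent sentence-transformers releases also expose `CrossEncoder.rank`, which scores and sorts the candidates in one call. The snippet below is a minimal sketch assuming your installed version includes that method; the query and documents are taken from the example above.
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("lv12/esci-ms-marco-MiniLM-L-12-v2", max_length=512)
documents = [
    "Nike Air Max, with air cushion",
    "Adidas Ultraboost, the best boost you can get",
    "Women's sandals wide width 9",
]

# rank() scores every document against the query and returns them sorted by
# descending relevance (requires a sentence-transformers release that ships
# CrossEncoder.rank).
hits = model.rank("adidas shoes", documents, return_documents=True)
for hit in hits:
    print(f"{hit['score']:.3f}\t{hit['text']}")
```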
# Training
Trained with MSE loss on `<query, document>` pairs, using the `grade` as the label.
```python
from sentence_transformers import InputExample

train_samples = [
    InputExample(texts=["query 1", "document 1"], label=0.3),
    InputExample(texts=["query 1", "document 2"], label=0.8),
    InputExample(texts=["query 2", "document 2"], label=0.1),
]
```
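The full training script and hyperparameters are not reproduced here. As a rough illustration, samples like these could be fed to `CrossEncoder.fit` with an MSE loss; the base checkpoint, batch size, epochs, and warmup steps below are illustrative assumptions, not the recorded training configuration.
```python
from torch import nn
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

train_samples = [
    InputExample(texts=["query 1", "document 1"], label=0.3),
    InputExample(texts=["query 1", "document 2"], label=0.8),
    InputExample(texts=["query 2", "document 2"], label=0.1),
]
train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=16)

# Start from the base MS MARCO cross-encoder with a single regression head.
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-12-v2", num_labels=1, max_length=512)
model.fit(
    train_dataloader=train_dataloader,
    loss_fct=nn.MSELoss(),  # regress the predicted score onto the graded label
    epochs=1,
    warmup_steps=100,
)
model.save("esci-ms-marco-MiniLM-L-12-v2")
```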