juanluisdb
committed on
Commit 3b40d1b
1 Parent(s): 5c9c5e5
Update README.md
README.md
CHANGED
@@ -23,7 +23,7 @@ using [bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3) as te
 from transformers import AutoTokenizer, AutoModelForSequenceClassification
 import torch
 model = AutoModelForSequenceClassification.from_pretrained("juanluisdb/MiniLM-L-6-rerank-reborn")
-tokenizer = AutoTokenizer.from_pretrained("juanluisdb/MiniLM-L-6-rerank-
+tokenizer = AutoTokenizer.from_pretrained("juanluisdb/MiniLM-L-6-rerank-m3")
 features = tokenizer(['How many people live in Berlin?', 'How many people live in Berlin?'], ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.', 'New York City is famous for the Metropolitan Museum of Art.'], padding=True, truncation=True, return_tensors="pt")
 model.eval()
 with torch.no_grad():
@@ -36,7 +36,7 @@ with torch.no_grad():
 
 ```python
 from sentence_transformers import CrossEncoder
-model = CrossEncoder("juanluisdb/MiniLM-L-6-rerank-
+model = CrossEncoder("juanluisdb/MiniLM-L-6-rerank-m3", max_length=512)
 scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2') , ('Query', 'Paragraph3')])
 ```
 
@@ -45,7 +45,7 @@ scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2') , ('Que
 ### BEIR (NDCG@10)
 I've run tests on different BEIR datasets. Cross Encoders rerank top100 BM25 results.
 
-| | bm25 | jina-reranker-v1-turbo-en | bge-reranker-v2-m3 | mxbai-rerank-base-v1 | ms-marco-MiniLM-L-6-v2 | MiniLM-L-6-rerank-
+| | bm25 | jina-reranker-v1-turbo-en | bge-reranker-v2-m3 | mxbai-rerank-base-v1 | ms-marco-MiniLM-L-6-v2 | MiniLM-L-6-rerank-m3 |
 |:---------------|-------:|----------------------------:|:---------------------|:-----------------------|-------------------------:|:------------------------------|
 | nq* | 0.305 | 0.533 | **0.597** | 0.535 | 0.523 | 0.580 |
 | fever* | 0.638 | 0.852 | 0.857 | 0.767 | 0.801 | **0.867** |
@@ -61,9 +61,9 @@ I've run tests on different BEIR datasets. Cross Encoders rerank top100 BM25 res
 
 \* Training splits of NQ and Fever were used as part of the training data.
 
-Comparison with [ablated model](https://huggingface.co/juanluisdb/MiniLM-L-6-rerank-
+Comparison with [ablated model](https://huggingface.co/juanluisdb/MiniLM-L-6-rerank-m3-ablated) trained only on MSMarco:
 
-| | ms-marco-MiniLM-L-6-v2 | MiniLM-L-6-rerank-
+| | ms-marco-MiniLM-L-6-v2 | MiniLM-L-6-rerank-m3-ablated |
 |:---------------|-------------------------:|--------------------------------------:|
 | nq | 0.5234 | **0.5412** |
 | fever | 0.8007 | **0.8221** |
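
The tables in this README report NDCG@10 after a cross-encoder reranks BM25 candidates. As a rough illustration of what that metric measures for a single query, here is a minimal sketch. The function names `dcg` and `ndcg_at_k`, the linear-gain formulation, and the toy document ids and relevance labels are all assumptions made for illustration; this is not the evaluation code behind the reported numbers.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1), ranks starting at 1."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranked_doc_ids, relevance, k=10):
    """NDCG@k for one query.

    ranked_doc_ids: document ids in the order the system returned them.
    relevance: dict mapping judged document ids to graded relevance (unjudged = 0).
    """
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    # The ideal ranking places the most relevant judged documents first.
    ideal_gains = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal_gains)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy data (hypothetical): two relevant documents for the query.
relevance = {"d1": 1, "d3": 1}
first_stage_order = ["d2", "d1", "d4", "d3"]  # e.g. a lexical retriever's ranking
reranked_order = ["d1", "d3", "d2", "d4"]     # e.g. after a cross-encoder rescoring

ndcg_first_stage = ndcg_at_k(first_stage_order, relevance)
ndcg_reranked = ndcg_at_k(reranked_order, relevance)
print(f"first stage: {ndcg_first_stage:.3f}, reranked: {ndcg_reranked:.3f}")
```

A benchmark score is this per-query value averaged over all queries in a dataset. In the toy data, the reranked order puts both relevant documents at the top, so it scores higher than the first-stage order.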