---
license: apache-2.0
datasets:
- squad_v2
library_name: sentence-transformers
pipeline_tag: sentence-similarity
---
This similarity model was fine-tuned for 2 epochs from sentence-transformers/all-mpnet-base-v2. It was trained on the SQuAD 2.0 dataset to score the similarity between a question and sentences containing the answer to that question, using MultipleNegativesRankingLoss as the objective function. To generate hard negatives, we used BM25 to retrieve, for each question, the most similar sentences from the dataset that did not contain the answer.
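The negative-mining strategy above can be sketched in plain Python. This is an illustrative reimplementation, not the exact pipeline used for training: the tokenization (lowercased whitespace split), the BM25 parameters, and the `mine_negatives` helper are all assumptions.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with standard BM25."""
    N = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / N
    # Document frequency of each term across the corpus.
    df = Counter()
    for d in docs_tokens:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def mine_negatives(question, answer, sentences, top_k=3):
    """Hypothetical mining step: rank sentences by BM25 similarity to the
    question, then keep the top-scoring ones that do NOT contain the answer."""
    toks = [s.lower().split() for s in sentences]
    scores = bm25_scores(question.lower().split(), toks)
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    return [sentences[i] for i in ranked
            if answer.lower() not in sentences[i].lower()][:top_k]
```

Because the negatives are lexically close to the question but lack the answer, they are harder than random negatives, which is what makes MultipleNegativesRankingLoss effective here.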
As a result, the model retrieves the correct sentence (the one containing the answer) for a question more often, as measured by Mean Reciprocal Rank (MRR@10). We evaluated on seven SQuAD-like datasets.
Model | Drop | BioASQ | Hotpot | News | Textbook | NQ | Trivia |
---|---|---|---|---|---|---|---|
Baseline Dense | 0.37 | 0.12 | 0.28 | 0.29 | 0.17 | 0.29 | 0.47 |
Baseline BM25 | 0.36 | 0.15 | 0.34 | 0.33 | 0.18 | 0.18 | 0.47 |
Azure Hybrid Semantic | 0.35 | 0.18 | 0.29 | | | | |
This model | 0.39 | 0.16 | 0.34 | 0.36 | 0.19 | 0.28 | 0.49 |
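For reference, MRR@10 (the metric reported in the table) can be computed as follows. This is a generic sketch of the metric, not the exact evaluation script used for these numbers; the input convention (1-based rank of the first correct sentence per query, `None` if it was not retrieved) is an assumption.

```python
def mrr_at_k(first_correct_ranks, k=10):
    """Mean Reciprocal Rank truncated at k.

    first_correct_ranks: for each query, the 1-based rank at which the
    first correct sentence appears in the retrieved list, or None if no
    correct sentence was retrieved. Ranks beyond k contribute 0.
    """
    total = 0.0
    for rank in first_correct_ranks:
        if rank is not None and rank <= k:
            total += 1.0 / rank
    return total / len(first_correct_ranks)
```

For example, four queries whose first correct hits appear at ranks 1, 2, not-retrieved, and 12 yield (1 + 0.5 + 0 + 0) / 4 = 0.375.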