Sheshera Mysore committed
Commit a83a9b4 • Parent(s): c5d50b6
Language and small clarifications.
README.md CHANGED
@@ -35,7 +35,7 @@ The model was trained with the Adam Optimizer and a learning rate of 1e-5 with 1
 
 ### Intended uses & limitations
 
-This model is trained for document similarity tasks in biomedical scientific text using a single vector per document. Here, the documents are the title and abstract of a paper. With appropriate fine-tuning the model can also be used for other tasks such as classification. Since the training data comes primarily from biomedicine, performance on other domains may be poorer.
+This model is trained for document similarity tasks in **biomedical** scientific text using a single vector per document. Here, the documents are the title and abstract of a paper. With appropriate fine-tuning the model can also be used for other tasks such as classification. Since the training data comes primarily from biomedicine, performance on other domains may be poorer.
 
 ### How to use
 
@@ -56,19 +56,19 @@ clsrep = result.last_hidden_state[:,0,:]
 **`aspire-biencoder-biomed-scib-full`**, can be used as follows: 1) Download the [`aspire-biencoder-biomed-scib-full.zip`](https://drive.google.com/file/d/1MDCv9Fc33eP015HTWKi50WYXixh72h5c/view?usp=sharing), and 2) Use it per this example usage script: [`aspire/examples/ex_aspire_bienc.py`](https://github.com/allenai/aspire/blob/main/examples/ex_aspire_bienc.py)
 
 ### Variable and metrics
 
-This model is evaluated on information retrieval datasets with document level queries. Here we report performance on RELISH, and TRECCOVID. These are detailed on [github](https://github.com/allenai/aspire) and in our [paper](https://arxiv.org/abs/2111.08366). These datasets represent a abstract level retrieval task, where given a query scientific abstract the task requires the retrieval of relevant candidate abstracts.
+This model is evaluated on information retrieval datasets with document level queries. Here we report performance on RELISH (biomedical/English), and TRECCOVID (biomedical/English). These are detailed on [github](https://github.com/allenai/aspire) and in our [paper](https://arxiv.org/abs/2111.08366). These datasets represent a abstract level retrieval task, where given a query scientific abstract the task requires the retrieval of relevant candidate abstracts.
 
 We rank documents by the L2 distance between the query and candidate documents.
 
 ### Evaluation results
 
-The released model `aspire-biencoder-biomed-spec` (and `aspire-biencoder-biomed-spec-full`) is compared against `allenai/specter`. `aspire-biencoder-biomed-spec`<sup>*</sup> is the performance reported in our paper by averaging over 3 re-runs of the model. The released models `aspire-biencoder-biomed-spec` and `aspire-biencoder-biomed-spec-full` are the single best run among the 3 re-runs.
+The released model `aspire-biencoder-biomed-spec` (and `aspire-biencoder-biomed-spec-full`) is compared against `allenai/specter`. `aspire-biencoder-biomed-spec-full`<sup>*</sup> is the performance reported in our paper by averaging over 3 re-runs of the model. The released models `aspire-biencoder-biomed-spec` and `aspire-biencoder-biomed-spec-full` are the single best run among the 3 re-runs.
 
 |                                             | TRECCOVID | TRECCOVID | RELISH | RELISH  |
 |-------------------------------------------:|:---------:|:-------:|:------:|:-------:|
 |                                             | MAP       | NDCG%20 | MAP    | NDCG%20 |
 | `specter`                                   | 28.24     | 59.28   | 60.62  | 77.20   |
-| `aspire-biencoder-biomed-spec`<sup>*</sup>  | 28.59     | 60.07   | 61.43  | 77.96   |
+| `aspire-biencoder-biomed-spec-full`<sup>*</sup> | 28.59 | 60.07   | 61.43  | 77.96   |
 | `aspire-biencoder-biomed-spec`              | 26.07     | 54.89   | 61.47  | 78.34   |
 | `aspire-biencoder-biomed-spec-full`         | 28.87     | 60.47   | 61.69  | 78.22   |
 
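For context, the second hunk header ends with `clsrep = result.last_hidden_state[:,0,:]`, the last line of the README's existing "How to use" snippet. Below is a minimal sketch of that style of usage with Hugging Face `transformers`; the checkpoint name `allenai/aspire-biencoder-biomed-spec` and the example title/abstract strings are assumptions for illustration, not part of this commit.

```python
# Sketch: encode a paper's title + abstract into a single CLS vector,
# matching the `clsrep` line quoted in the hunk header above.
from transformers import AutoModel, AutoTokenizer

model_name = "allenai/aspire-biencoder-biomed-spec"  # assumption: exact hub ID may differ
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

title = "Example title of a biomedical paper"        # placeholder text
abstract = "Example abstract describing the paper."  # placeholder text
doc = [title + tokenizer.sep_token + abstract]       # title and abstract form one document

inputs = tokenizer(doc, padding=True, truncation=True, max_length=512, return_tensors="pt")
result = model(**inputs)
clsrep = result.last_hidden_state[:, 0, :]           # single vector per document (CLS token)
```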
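The "Variable and metrics" paragraph notes that candidates are ranked by the L2 distance between query and candidate document vectors. A small illustrative sketch follows; the random placeholder vectors stand in for CLS representations produced as in the snippet above.

```python
# Sketch: rank candidate abstracts by L2 distance to the query vector.
import torch

query_vec = torch.randn(1, 768)  # placeholder: 1 x hidden_size query embedding
cand_vecs = torch.randn(5, 768)  # placeholder: 5 candidate embeddings

dists = torch.cdist(query_vec, cand_vecs, p=2).squeeze(0)  # L2 distance to each candidate
ranking = torch.argsort(dists)                              # smaller distance = more relevant
print(ranking.tolist())
```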