riccorl committed on
Commit
a114227
1 Parent(s): 774b629

Update README.md

  1. README.md +33 -28
README.md CHANGED
@@ -107,34 +107,43 @@ The retriever is responsible for retrieving relevant documents from a large coll
  while the reader is responsible for extracting entities and relations from the retrieved documents.
  ReLiK can be used with the `from_pretrained` method to load a pre-trained pipeline.

- Here is an example of how to use ReLiK for **Entity Linking**:

  ```python
  from relik import Relik
  from relik.inference.data.objects import RelikOutput

- relik = Relik.from_pretrained("sapienzanlp/relik-entity-linking-large")
  relik_out: RelikOutput = relik("Michael Jordan was one of the best players in the NBA.")
  ```

  RelikOutput(
- text="Michael Jordan was one of the best players in the NBA.",
- tokens=['Michael', 'Jordan', 'was', 'one', 'of', 'the', 'best', 'players', 'in', 'the', 'NBA', '.'],
- id=0,
  spans=[
- Span(start=0, end=14, label="Michael Jordan", text="Michael Jordan"),
- Span(start=50, end=53, label="National Basketball Association", text="NBA"),
- ],
- triples=[],
  candidates=Candidates(
- span=[
- [
  [
- {"text": "Michael Jordan", "id": 4484083},
- {"text": "National Basketball Association", "id": 5209815},
- {"text": "Walter Jordan", "id": 2340190},
- {"text": "Jordan", "id": 3486773},
- {"text": "50 Greatest Players in NBA History", "id": 1742909},
  ...
  ]
  ]
@@ -142,22 +151,18 @@ relik_out: RelikOutput = relik("Michael Jordan was one of the best players in th
  ),
  )
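
The `start` and `end` fields in the Entity Linking output above appear to be end-exclusive character offsets into `text`. The following minimal sketch checks that reading with plain Python. `SpanStub` and `surface_forms` are hypothetical names introduced here for illustration; they hold values copied from the output above and are not part of the `relik` package.

```python
from dataclasses import dataclass


@dataclass
class SpanStub:
    """Hypothetical stand-in for the Span objects printed above (not the relik class)."""
    start: int
    end: int
    label: str
    text: str


text = "Michael Jordan was one of the best players in the NBA."
spans = [
    SpanStub(start=0, end=14, label="Michael Jordan", text="Michael Jordan"),
    SpanStub(start=50, end=53, label="National Basketball Association", text="NBA"),
]


def surface_forms(text, spans):
    """Pair each mention, sliced from the text by its offsets, with its linked entity label."""
    return [(text[s.start:s.end], s.label) for s in spans]


mentions = surface_forms(text, spans)
for surface, label in mentions:
    print(f"{surface!r} -> {label}")
```

Slicing by offsets rather than trusting the `text` field of each span is a cheap consistency check when the input has been pre-processed before linking.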

  ## 📊 Performance

- We evaluate the performance of ReLiK on Entity Linking using [GERBIL](http://gerbil-qa.aksw.org/gerbil/). The following table shows the results (InKB Micro F1) of ReLiK Large and Base:

- | Model | AIDA | MSNBC | Der | K50 | R128 | R500 | O15 | O16 | Tot | OOD | AIT (m:s) |
- |------------------------------------------|------|-------|------|------|------|------|------|------|------|------|------------|
- | GENRE | 83.7 | 73.7 | 54.1 | 60.7 | 46.7 | 40.3 | 56.1 | 50.0 | 58.2 | 54.5 | 38:00 |
- | EntQA | 85.8 | 72.1 | 52.9 | 64.5 | **54.1** | 41.9 | 61.1 | 51.3 | 60.5 | 56.4 | 20:00 |
- | [ReLiK<sub>Base</sub>](https://huggingface.co/sapienzanlp/relik-entity-linking-base) | 85.3 | 72.3 | 55.6 | 68.0 | 48.1 | 41.6 | 62.5 | 52.3 | 60.7 | 57.2 | 00:29 |
- | ➡️ [ReLiK<sub>Large</sub>](https://huggingface.co/sapienzanlp/relik-entity-linking-large) | **86.4** | **75.0** | **56.3** | **72.8** | 51.7 | **43.0** | **65.1** | **57.2** | **63.4** | **60.2** | 01:46 |

- Comparison systems' evaluation (InKB Micro F1) on the *in-domain* AIDA test set and *out-of-domain* MSNBC (MSN), Derczynski (Der), KORE50 (K50), N3-Reuters-128 (R128),
- N3-RSS-500 (R500), OKE-15 (O15), and OKE-16 (O16) test sets. **Bold** indicates the best model.
- GENRE uses mention dictionaries.
- The AIT column shows the time in minutes and seconds (m:s) that the systems need to process the whole AIDA test set using an NVIDIA RTX 4090,
- except for EntQA which does not fit in 24GB of RAM and for which an A100 is used.

  ## 🤖 Models

 
  while the reader is responsible for extracting entities and relations from the retrieved documents.
  ReLiK can be used with the `from_pretrained` method to load a pre-trained pipeline.

+ Here is an example of how to use ReLiK for **Relation Extraction**:

  ```python
  from relik import Relik
  from relik.inference.data.objects import RelikOutput

+ relik = Relik.from_pretrained("sapienzanlp/relik-relation-extraction-nyt-large")
  relik_out: RelikOutput = relik("Michael Jordan was one of the best players in the NBA.")
  ```

+
  RelikOutput(
+ text='Michael Jordan was one of the best players in the NBA.',
+ tokens=Michael Jordan was one of the best players in the NBA.,
+ id=0,
  spans=[
+ Span(start=0, end=14, label='--NME--', text='Michael Jordan'),
+ Span(start=50, end=53, label='--NME--', text='NBA')
+ ],
+ triplets=[
+ Triplets(
+ subject=Span(start=0, end=14, label='--NME--', text='Michael Jordan'),
+ label='company',
+ object=Span(start=50, end=53, label='--NME--', text='NBA'),
+ confidence=1.0
+ )
+ ],
  candidates=Candidates(
+ span=[],
+ triplet=[
  [
+ [
+ {"text": "company", "id": 4, "metadata": {"definition": "company of this person"}},
+ {"text": "nationality", "id": 10, "metadata": {"definition": "nationality of this person or entity"}},
+ {"text": "child", "id": 17, "metadata": {"definition": "child of this person"}},
+ {"text": "founded by", "id": 0, "metadata": {"definition": "founder or co-founder of this organization, religion or place"}},
+ {"text": "residence", "id": 18, "metadata": {"definition": "place where this person has lived"}},
  ...
  ]
  ]

  ),
  )
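
Each entry in `triplets` above carries a subject span, a relation label, an object span, and a confidence score. As a rough sketch of how such output could be flattened into `(subject, relation, object)` tuples, the snippet below uses hypothetical stand-in dataclasses (`SpanStub`, `TripletStub`) populated with the values printed above. The helper `confident_triplets` and its `threshold` parameter are assumptions made for illustration and are not part of the `relik` API.

```python
from dataclasses import dataclass


@dataclass
class SpanStub:
    """Hypothetical stand-in mirroring the fields of the printed Span objects."""
    start: int
    end: int
    label: str
    text: str


@dataclass
class TripletStub:
    """Hypothetical stand-in mirroring the fields of the printed Triplets objects."""
    subject: SpanStub
    label: str
    object: SpanStub
    confidence: float


triplets = [
    TripletStub(
        subject=SpanStub(start=0, end=14, label="--NME--", text="Michael Jordan"),
        label="company",
        object=SpanStub(start=50, end=53, label="--NME--", text="NBA"),
        confidence=1.0,
    )
]


def confident_triplets(triplets, threshold=0.5):
    """Keep triplets at or above the (assumed) confidence threshold as plain tuples."""
    return [
        (t.subject.text, t.label, t.object.text)
        for t in triplets
        if t.confidence >= threshold
    ]


for subj, rel, obj in confident_triplets(triplets):
    print(f"{subj} --[{rel}]--> {obj}")
```

If the real output objects expose the same attribute names as the printed representation, the same helper should work on `relik_out.triplets` directly, but that is untested here.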

+
  ## 📊 Performance

+ The following table shows the results (Micro F1) of ReLiK Large on the NYT dataset:

+ | Model | NYT | NYT (Pretr) | AIT (m:s) |
+ |------------------------------------------|------|-------|------------|
+ | REBEL | 93.1 | 93.4 | 01:45 |
+ | UiE | 93.5 | -- | -- |
+ | USM | 94.0 | 94.1 | -- |
+ | ➡️ [ReLiK<sub>Large</sub>](https://huggingface.co/sapienzanlp/relik-relation-extraction-nyt-large) | **95.0** | **94.9** | 00:30 |
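
To make the AIT column easier to compare, the `m:s` values can be converted to seconds and turned into a ratio. The sketch below does this for the two systems that report a time in the table above; `to_seconds` is a helper name introduced here for illustration.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'mm:ss' string, as used in the AIT column, to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# AIT values copied from the table above.
ait = {"REBEL": "01:45", "ReLiK Large": "00:30"}

speedup = to_seconds(ait["REBEL"]) / to_seconds(ait["ReLiK Large"])
print(f"ReLiK Large is {speedup:.1f}x faster than REBEL")  # 105 s / 30 s = 3.5x
```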

  ## 🤖 Models