sapienzanlp/relik-retriever-small-nyt-question-encoder

@@ -107,34 +107,43 @@ The retriever is responsible for retrieving relevant documents from a large coll
 while the reader is responsible for extracting entities and relations from the retrieved documents.
 ReLiK can be used with the `from_pretrained` method to load a pre-trained pipeline.
-Here is an example of how to use ReLiK for **Entity Linking**:
 ```python
 from relik import Relik
 from relik.inference.data.objects import RelikOutput
-relik = Relik.from_pretrained("sapienzanlp/relik-entity-linking-large")
 relik_out: RelikOutput = relik("Michael Jordan was one of the best players in the NBA.")
 ```
     RelikOutput(
-      text="Michael Jordan was one of the best players in the NBA.",
-      tokens=['Michael', 'Jordan', 'was', 'one', 'of', 'the', 'best', 'players', 'in', 'the', 'NBA', '.'],
-      id=0,
       spans=[
-          Span(start=0, end=14, label="Michael Jordan", text="Michael Jordan"),
-          Span(start=50, end=53, label="National Basketball Association", text="NBA"),
-      ],
-      triples=[],
       candidates=Candidates(
-          span=[
-              [
                   [
-                      {"text": "Michael Jordan", "id": 4484083},
-                      {"text": "National Basketball Association", "id": 5209815},
-                      {"text": "Walter Jordan", "id": 2340190},
-                      {"text": "Jordan", "id": 3486773},
-                      {"text": "50 Greatest Players in NBA History", "id": 1742909},
                       ...
                   ]
               ]
@@ -142,22 +151,18 @@ relik_out: RelikOutput = relik("Michael Jordan was one of the best players in th
       ),
     )
 ## 📊 Performance
-We evaluate the performance of ReLiK on Entity Linking using [GERBIL](http://gerbil-qa.aksw.org/gerbil/). The following table shows the results (InKB Micro F1) of ReLiK Large and Base:
-| Model                                    | AIDA | MSNBC | Der  | K50  | R128 | R500 | O15  | O16  | Tot  | OOD  | AIT (m:s) |
-|------------------------------------------|------|-------|------|------|------|------|------|------|------|------|------------|
-| GENRE                                    | 83.7 | 73.7  | 54.1 | 60.7 | 46.7 | 40.3 | 56.1 | 50.0 | 58.2 | 54.5 | 38:00      |
-| EntQA                                    | 85.8 | 72.1  | 52.9 | 64.5 | **54.1** | 41.9 | 61.1 | 51.3 | 60.5 | 56.4 | 20:00      |
-| [ReLiK<sub>Base</sub>](https://huggingface.co/sapienzanlp/relik-entity-linking-base)                      | 85.3 | 72.3  | 55.6 | 68.0 | 48.1 | 41.6 | 62.5 | 52.3 | 60.7 | 57.2 | 00:29      |
-| ➡️ [ReLiK<sub>Large</sub>](https://huggingface.co/sapienzanlp/relik-entity-linking-large)                     | **86.4** | **75.0**  | **56.3** | **72.8** | 51.7 | **43.0** | **65.1** | **57.2** | **63.4** | **60.2** | 01:46      |
-Comparison systems' evaluation (InKB Micro F1) on the *in-domain* AIDA test set and *out-of-domain* MSNBC (MSN), Derczynski (Der), KORE50 (K50), N3-Reuters-128 (R128),
-N3-RSS-500 (R500), OKE-15 (O15), and OKE-16 (O16) test sets. **Bold** indicates the best model.
-GENRE uses mention dictionaries.
-The AIT column shows the time in minutes and seconds (m:s) that the systems need to process the whole AIDA test set using an NVIDIA RTX 4090,
-except for EntQA which does not fit in 24GB of RAM and for which an A100 is used.
 ## 🤖 Models

 while the reader is responsible for extracting entities and relations from the retrieved documents.
 ReLiK can be used with the `from_pretrained` method to load a pre-trained pipeline.
+Here is an example of how to use ReLiK for **Relation Extraction**:
 ```python
 from relik import Relik
 from relik.inference.data.objects import RelikOutput
+relik = Relik.from_pretrained("sapienzanlp/relik-relation-extraction-nyt-large")
 relik_out: RelikOutput = relik("Michael Jordan was one of the best players in the NBA.")
 ```
     RelikOutput(
+      text='Michael Jordan was one of the best players in the NBA.',
+      tokens=['Michael', 'Jordan', 'was', 'one', 'of', 'the', 'best', 'players', 'in', 'the', 'NBA', '.'],
+      id=0,
       spans=[
+        Span(start=0, end=14, label='--NME--', text='Michael Jordan'),
+        Span(start=50, end=53, label='--NME--', text='NBA')
+      ],
+      triplets=[
+        Triplets(
+          subject=Span(start=0, end=14, label='--NME--', text='Michael Jordan'),
+          label='company',
+          object=Span(start=50, end=53, label='--NME--', text='NBA'),
+          confidence=1.0
+          )
+      ],
       candidates=Candidates(
+        span=[],
+        triplet=[
                   [
+                    [
+                      {"text": "company", "id": 4, "metadata": {"definition": "company of this person"}},
+                      {"text": "nationality", "id": 10, "metadata": {"definition": "nationality of this person or entity"}},
+                      {"text": "child", "id": 17, "metadata": {"definition": "child of this person"}},
+                      {"text": "founded by", "id": 0, "metadata": {"definition": "founder or co-founder of this organization, religion or place"}},
+                      {"text": "residence", "id": 18, "metadata": {"definition": "place where this person has lived"}},
                       ...
                   ]
               ]
       ),
     )
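
The printed output above can be flattened into plain `(subject, relation, object)` tuples. The following is a minimal sketch, not the library's own API: `Span` and `Triplet` are hypothetical stand-in dataclasses that mirror only the fields visible in the printed `RelikOutput`, and `triplets_to_tuples` and its `min_confidence` parameter are illustrative names. It also checks that the character offsets index back into the input text.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical stand-ins mirroring only the fields visible in the printed
# RelikOutput above; these are NOT the classes shipped with relik.
@dataclass
class Span:
    start: int
    end: int
    label: str
    text: str


@dataclass
class Triplet:
    subject: Span
    label: str
    object: Span
    confidence: float


def triplets_to_tuples(
    triplets: List[Triplet], min_confidence: float = 0.5
) -> List[Tuple[str, str, str]]:
    """Keep triplets at or above min_confidence as (subject, relation, object)."""
    return [
        (t.subject.text, t.label, t.object.text)
        for t in triplets
        if t.confidence >= min_confidence
    ]


# Values copied from the example output above.
text = "Michael Jordan was one of the best players in the NBA."
subject = Span(start=0, end=14, label="--NME--", text="Michael Jordan")
obj = Span(start=50, end=53, label="--NME--", text="NBA")
example = [Triplet(subject=subject, label="company", object=obj, confidence=1.0)]

facts = triplets_to_tuples(example)
print(facts)
```

With a real `RelikOutput`, the same comprehension would presumably be applied to its `triplets` attribute, assuming the attribute names match the printed representation.
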
 ## 📊 Performance
+The following table shows the results (Micro F1) of ReLiK Large on the NYT dataset:
+| Model                                    | NYT | NYT (Pretr) | AIT (m:s) |
+|------------------------------------------|------|-------|------------|
+| REBEL                                    | 93.1 | 93.4  | 01:45      |
+| UiE                                      | 93.5 | --    | --      |
+| USM                                      | 94.0 | 94.1  | --      |
+| ➡️ [ReLiK<sub>Large</sub>](https://huggingface.co/sapienzanlp/relik-relation-extraction-nyt-large) | **95.0** | **94.9**  | 00:30      |
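
For readers unfamiliar with the metric in the table, here is a minimal sketch of Micro F1 over extracted triples. It assumes exact matching of `(subject, relation, object)` tuples per sentence; the precise matching protocol behind the reported numbers is not stated here, so treat this as an illustration of the metric rather than the evaluation script. The function name `micro_f1` and the toy data are hypothetical.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def micro_f1(predicted: List[Set[Triple]], gold: List[Set[Triple]]) -> float:
    """Micro-averaged F1: pool counts over all sentences, then compute P, R, F1."""
    true_positives = sum(len(p & g) for p, g in zip(predicted, gold))
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / n_predicted
    recall = true_positives / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy data (invented for illustration): two sentences, three gold triples.
gold = [
    {("Michael Jordan", "company", "NBA")},
    {("A", "nationality", "B"), ("A", "residence", "C")},
]
predicted = [
    {("Michael Jordan", "company", "NBA")},
    {("A", "nationality", "B"), ("A", "child", "D")},
]

score = micro_f1(predicted, gold)
print(f"Micro F1: {score:.3f}")
```

Because the counts are pooled before dividing, sentences with many triples weigh more than sentences with few, which is what distinguishes micro from macro averaging.
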
 ## 🤖 Models