---
language:
- en
---
<div align="center">
<img src="https://github.com/SapienzaNLP/relik/blob/main/relik.png?raw=true" height="150">
<img src="https://github.com/SapienzaNLP/relik/blob/main/Sapienza_Babelscape.png?raw=true" height="50">
</div>
<div align="center">
<h1>Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget</h1>
</div>
<div style="display:flex; justify-content: center; align-items: center; flex-direction: row;">
<a href="https://2024.aclweb.org/"><img src="http://img.shields.io/badge/ACL-2024-4b44ce.svg"></a> &nbsp; &nbsp;
<a href="https://aclanthology.org/"><img src="http://img.shields.io/badge/paper-ACL--anthology-B31B1B.svg"></a> &nbsp; &nbsp;
<a href="https://arxiv.org/abs/2408.00103"><img src="https://img.shields.io/badge/arXiv-2408.00103-b31b1b.svg"></a>
</div>
<div style="display:flex; justify-content: center; align-items: center; flex-direction: row;">
<a href="https://huggingface.co/collections/sapienzanlp/relik-retrieve-read-and-link-665d9e4a5c3ecba98c1bef19"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Collection-FCD21D"></a> &nbsp; &nbsp;
<a href="https://github.com/SapienzaNLP/relik"><img src="https://img.shields.io/badge/GitHub-Repo-121013?logo=github&logoColor=white"></a> &nbsp; &nbsp;
<a href="https://github.com/SapienzaNLP/relik/releases"><img src="https://img.shields.io/github/v/release/SapienzaNLP/relik"></a>
</div>
A blazing fast and lightweight Information Extraction model for **Entity Linking** and **Relation Extraction**.
**This repository contains the weights for the ReLiK Retriever component fine-tuned on the NYT dataset.**
## πŸ› οΈ Installation
Installation from PyPI
```bash
pip install relik
```
<details>
<summary>Other installation options</summary>
#### Install with optional dependencies
Install with all the optional dependencies.
```bash
# Quotes keep shells such as zsh from treating the brackets as a glob pattern
pip install "relik[all]"
```
Install with optional dependencies for training and evaluation.
```bash
pip install "relik[train]"
```
Install with optional dependencies for [FAISS](https://github.com/facebookresearch/faiss).
The FAISS PyPI package is only available for CPU. For GPU, install it from source or use the conda package.
For CPU:
```bash
pip install "relik[faiss]"
```
For GPU:
```bash
conda create -n relik python=3.10
conda activate relik
# install pytorch
conda install -y pytorch=2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
# GPU
conda install -y -c pytorch -c nvidia faiss-gpu=1.8.0
# or GPU with NVIDIA RAFT
conda install -y -c pytorch -c nvidia -c rapidsai -c conda-forge faiss-gpu-raft=1.8.0
pip install relik
```
Install with optional dependencies for serving the models with
[FastAPI](https://fastapi.tiangolo.com/) and [Ray](https://docs.ray.io/en/latest/serve/quickstart.html).
```bash
pip install "relik[serve]"
```
#### Installation from source
```bash
git clone https://github.com/SapienzaNLP/relik.git
cd relik
pip install -e ".[all]"
```
</details>
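
After installing, it can be handy to check which optional packages are importable in the current environment. The following is a minimal sketch using only the standard library; the mapping from extras to import names (`faiss`, `fastapi`, `ray`) is an assumption based on the packages linked above, not something the ReLiK documentation specifies.

```python
from importlib.util import find_spec

# Assumed import names for the packages mentioned in the installation options.
OPTIONAL_MODULES = {
    "relik": "relik",
    "faiss": "faiss",
    "serve (FastAPI)": "fastapi",
    "serve (Ray)": "ray",
}


def available_modules(modules=OPTIONAL_MODULES):
    """Map each label to True if its top-level module can be found, else False."""
    return {label: find_spec(name) is not None for label, name in modules.items()}


if __name__ == "__main__":
    for label, ok in available_modules().items():
        print(f"{label:16s} {'installed' if ok else 'missing'}")
```
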
## 🚀 Quick Start
ReLiK is a lightweight and fast model for **Entity Linking** and **Relation Extraction**.
It is composed of two main components: a retriever and a reader.
The retriever is responsible for retrieving relevant documents from a large collection,
while the reader is responsible for extracting entities and relations from the retrieved documents.
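
To give an intuition for the retrieval step, here is a toy sketch that ranks candidate passages against a query and keeps the top `k`. It uses bag-of-words cosine similarity purely as a stand-in: the actual retriever is a trained neural model, and the function names `bow`, `cosine`, and `retrieve` are hypothetical and not part of the `relik` package.

```python
import math
from collections import Counter


def bow(text):
    """Lower-cased, whitespace-tokenized bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, top_k=2):
    """Return the top_k passages most similar to the query (toy retriever)."""
    q = bow(query)
    ranked = sorted(passages, key=lambda p: cosine(q, bow(p)), reverse=True)
    return ranked[:top_k]


passages = [
    "company of this person",
    "place where this person has lived",
    "child of this person",
]
print(retrieve("where has this person lived", passages, top_k=1))
```

In the real pipeline, the reader then receives the input text together with the retrieved candidates and decides which of them actually apply.
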
ReLiK can be used with the `from_pretrained` method to load a pre-trained pipeline.
Here is an example of how to use ReLiK for **Relation Extraction**:
```python
from relik import Relik
from relik.inference.data.objects import RelikOutput
relik = Relik.from_pretrained("sapienzanlp/relik-relation-extraction-nyt-large")
relik_out: RelikOutput = relik("Michael Jordan was one of the best players in the NBA.")
```

    RelikOutput(
      text='Michael Jordan was one of the best players in the NBA.',
      tokens=Michael Jordan was one of the best players in the NBA.,
      id=0,
      spans=[
          Span(start=0, end=14, label='--NME--', text='Michael Jordan'),
          Span(start=50, end=53, label='--NME--', text='NBA')
      ],
      triplets=[
          Triplets(
              subject=Span(start=0, end=14, label='--NME--', text='Michael Jordan'),
              label='company',
              object=Span(start=50, end=53, label='--NME--', text='NBA'),
              confidence=1.0
          )
      ],
      candidates=Candidates(
          span=[],
          triplet=[
              [
                  [
                      {"text": "company", "id": 4, "metadata": {"definition": "company of this person"}},
                      {"text": "nationality", "id": 10, "metadata": {"definition": "nationality of this person or entity"}},
                      {"text": "child", "id": 17, "metadata": {"definition": "child of this person"}},
                      {"text": "founded by", "id": 0, "metadata": {"definition": "founder or co-founder of this organization, religion or place"}},
                      {"text": "residence", "id": 18, "metadata": {"definition": "place where this person has lived"}},
                      ...
                  ]
              ]
          ]
      ),
    )
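
The output above can be flattened into plain `(subject, relation, object)` tuples for downstream use. The sketch below does not import `relik`; instead it defines stand-in dataclasses (`Span`, `Triplets`) that mirror only the fields visible in the printed output, and the helper `triplets_to_rows` with its `min_confidence` parameter is hypothetical. With a real `RelikOutput`, the same attribute names would presumably apply, but check them against the installed version.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Stand-in mirroring the Span fields shown in the printed output."""
    start: int
    end: int
    label: str
    text: str


@dataclass
class Triplets:
    """Stand-in mirroring the Triplets fields shown in the printed output."""
    subject: Span
    label: str
    object: Span
    confidence: float


def triplets_to_rows(triplets, min_confidence=0.5):
    """Keep triplets at or above min_confidence as (subject, relation, object) text tuples."""
    return [
        (t.subject.text, t.label, t.object.text)
        for t in triplets
        if t.confidence >= min_confidence
    ]


text = "Michael Jordan was one of the best players in the NBA."
subject = Span(start=0, end=14, label="--NME--", text="Michael Jordan")
obj = Span(start=50, end=53, label="--NME--", text="NBA")
rows = triplets_to_rows([Triplets(subject, "company", obj, confidence=1.0)])
print(rows)  # [('Michael Jordan', 'company', 'NBA')]
```

The `start` and `end` values are character offsets into the input text, so `text[span.start:span.end]` recovers the span text.
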
## 📊 Performance
The following table shows the results (Micro F1) of ReLiK Large on the NYT dataset:
| Model | NYT | NYT (Pretr) | AIT (m:s) |
|------------------------------------------|------|-------|------------|
| REBEL | 93.1 | 93.4 | 01:45 |
| UiE | 93.5 | -- | -- |
| USM | 94.0 | 94.1 | -- |
| ➡️ [ReLiK<sub>Large</sub>](https://huggingface.co/sapienzanlp/relik-relation-extraction-nyt-large) | **95.0** | **94.9** | 00:30 |
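
As a quick worked example, the two time values in the last column can be compared directly. This sketch assumes the AIT column is a wall-clock duration in minutes and seconds, as the `(m:s)` header suggests; the helper `to_seconds` is hypothetical.

```python
def to_seconds(mmss):
    """Convert an 'mm:ss' string into a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


rebel_s = to_seconds("01:45")  # 105 seconds
relik_s = to_seconds("00:30")  # 30 seconds
speedup = rebel_s / relik_s
print(f"ReLiK Large takes {relik_s}s vs {rebel_s}s for REBEL ({speedup:.1f}x faster)")
```
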
## 🤖 Models
Models can be found on [🤗 Hugging Face](https://huggingface.co/collections/sapienzanlp/relik-retrieve-read-and-link-665d9e4a5c3ecba98c1bef19).
## 💽 Cite this work
If you use any part of this work, please consider citing the paper as follows:
```bibtex
@inproceedings{orlando-etal-2024-relik,
title = "Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget",
author = "Orlando, Riccardo and Huguet Cabot, Pere-Llu{\'\i}s and Barba, Edoardo and Navigli, Roberto",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
month = aug,
year = "2024",
address = "Bangkok, Thailand",
publisher = "Association for Computational Linguistics",
}
```