---
tags:
- feature-extraction
- endpoints-template
license: bsd-3-clause
library_name: generic
---

# Coreference Resolution for Long Documents

Modified coreference resolution model from [BERT for Coreference Resolution: Baselines and Analysis](https://aclanthology.org/D19-1588/) for handling long documents (~40K words) efficiently (500K words/s on an NVIDIA Tesla V100). This modified model was used in [DAPR: A Benchmark on Document-Aware Passage Retrieval](https://arxiv.org/abs/2305.13915).

## Usage

### API call

One can call the Hugging Face Inference Endpoints API directly:

```python
import requests
import time

API_URL = "https://api-inference.huggingface.co/models/kwang2049/long-coref"
headers = {"Authorization": "Bearer ${YOUR_HUGGINGFACE_ACCESS_TOKEN}"}


def query(payload):
    while True:
        response = requests.post(API_URL, headers=headers, json=payload)
        if response.status_code == 503:
            # Service unavailable (typically the model is still loading): wait and retry.
            time.sleep(5)
            print(response.json()["error"])
            continue
        elif response.status_code == 200:
            return response.json()
        else:
            # Any other status code is treated as a failure.
            error_message = f"{response.status_code}: {response.json()['error']}."
            raise requests.HTTPError(error_message)


# Each list item is one paragraph of the document.
doc = [
    "The Half Moon is a public house and music venue in Putney, London. It is one of the city's longest running live music venues, and has hosted live music every night since 1963.",
    "The pub is on the south side of the Lower Richmond road, in the London Borough of Wandsworth."
]
PARAGRAPH_DELIMITER = "\n\n"
output = query(
    {
        "inputs": PARAGRAPH_DELIMITER.join(doc),
    }
)
print(output)
# {
#     'pargraph_sentences': ...,
#     'top_spans': ...,
#     'antecedents': ...
# }
```

### Local run

One can also run the code from the repo on a local machine:

```bash
# Clone the repo
git lfs install
git clone https://huggingface.co/kwang2049/long-coref
cd long-coref
pip install -r requirements.txt
python local_run.py
```

## Citation

If you use the repo, feel free to cite our publication [DAPR: A Benchmark on Document-Aware Passage Retrieval](https://arxiv.org/abs/2305.13915):

```bibtex
@article{wang2023dapr,
    title = "DAPR: A Benchmark on Document-Aware Passage Retrieval",
    author = "Kexin Wang and Nils Reimers and Iryna Gurevych",
    journal = "arXiv preprint arXiv:2305.13915",
    year = "2023",
    url = "https://arxiv.org/abs/2305.13915",
}
```
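
## Appendix: bounded retries for the API call

The `query` function in the API call example retries indefinitely while the endpoint answers with status 503. The following is a minimal sketch of a variant that gives up after a fixed number of attempts. The names `query_with_retry_limit`, `post`, `max_retries`, and `wait_seconds` are hypothetical and not part of the repo; the HTTP call is passed in as a function so the sketch runs without network access.

```python
import time
from typing import Any, Callable, Dict


def query_with_retry_limit(
    post: Callable[[Dict[str, Any]], Any],
    payload: Dict[str, Any],
    max_retries: int = 10,
    wait_seconds: float = 5.0,
) -> Any:
    """Call post(payload) until it returns status 200.

    Retries on status 503 at most `max_retries` times (hypothetical helper,
    not part of the long-coref repo). `post` must return an object with
    `status_code`, `text`, and `json()`, as `requests.Response` does.
    """
    for attempt in range(max_retries + 1):
        response = post(payload)
        if response.status_code == 200:
            return response.json()
        if response.status_code != 503:
            # Any status other than 200 or 503 is treated as a hard failure.
            raise RuntimeError(f"{response.status_code}: {response.text}")
        if attempt < max_retries:
            # 503: wait before the next attempt.
            time.sleep(wait_seconds)
    raise TimeoutError(f"Still receiving 503 after {max_retries} retries.")
```

With the names from the API call example, this could be used as `query_with_retry_limit(lambda p: requests.post(API_URL, headers=headers, json=p), {"inputs": PARAGRAPH_DELIMITER.join(doc)})`.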