---
library_name: transformers
language:
- en
license: cc-by-nc-sa-4.0
---

# Model Card: T5-base-summarization-claim-extractor

## Model Description

**Model Name:** T5-base-summarization-claim-extractor  
**Authors:** Alessandro Scirè, Karim Ghonim, and Roberto Navigli  
**Language:** English  
**Primary Use:** Extraction of atomic claims from a summary

### Overview

The T5-base-summarization-claim-extractor is a model developed for the task of extracting atomic claims from summaries. It is based on the T5 architecture and fine-tuned specifically for claim extraction.

This model was introduced as part of the research presented in the paper ["FENICE: Factuality Evaluation of summarization based on Natural Language Inference and Claim Extraction"](https://aclanthology.org/2024.findings-acl.841.pdf) by Alessandro Scirè, Karim Ghonim, and Roberto Navigli. FENICE leverages Natural Language Inference (NLI) and Claim Extraction to evaluate the factuality of summaries.

### Intended Use

This model is designed to:

- Extract atomic claims from summaries.
- Serve as a component in pipelines for factuality evaluation of summaries.

## Example Code

You can use the following code to extract atomic claims from a summary:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

device = "cuda:0"  # use "cpu" if no GPU is available

tokenizer = T5Tokenizer.from_pretrained("Babelscape/t5-base-summarization-claim-extractor")
model = T5ForConditionalGeneration.from_pretrained("Babelscape/t5-base-summarization-claim-extractor").to(device)

summary = 'Simone Biles made a triumphant return to the Olympic stage at the Paris 2024 Games, competing in the women’s gymnastics qualifications. Overcoming a previous struggle with the “twisties” that led to her withdrawal from events at the Tokyo 2020 Olympics, Biles dazzled with strong performances on all apparatus, helping the U.S. team secure a commanding lead in the qualifications. Her routines showcased her resilience and skill, drawing enthusiastic support from a star-studded audience.'

# Tokenize the summary and move the tensors to the same device as the model
tok_input = tokenizer.batch_encode_plus([summary], return_tensors="pt", padding=True).to(device)

# Generate the claims and decode them back to text
claims = model.generate(**tok_input)
claims = tokenizer.batch_decode(claims, skip_special_tokens=True)
```

### Training

For details regarding the training process, please check out our [paper](https://aclanthology.org/2024.findings-acl.841.pdf) (Section 4.1).

### Performance

Evaluation results for the claim extractor are reported in the [paper](https://aclanthology.org/2024.findings-acl.841.pdf).
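### Batched Extraction

For use inside a larger factuality-evaluation pipeline, it can be convenient to wrap extraction in a small helper that handles several summaries at once. Below is a minimal sketch, assuming the `tokenizer` and `model` loaded in the example above; the `extract_claims` name and the `max_new_tokens` value are illustrative choices, not part of the released code.

```python
from typing import List

import torch


def extract_claims(
    summaries: List[str],
    tokenizer,
    model,
    device: str = "cuda:0",
    max_new_tokens: int = 256,  # illustrative cap on generated claim length
) -> List[str]:
    """Return one decoded claim string per input summary."""
    inputs = tokenizer.batch_encode_plus(
        summaries, return_tensors="pt", padding=True, truncation=True
    ).to(device)
    with torch.no_grad():  # inference only; no gradients needed
        outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Called as `extract_claims([summary], tokenizer, model)`, this returns a list with one decoded string per input summary.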
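### Splitting the Output into Individual Claims

The model may emit several claims inside a single decoded string. The snippet below is a naive sketch for turning such a string into a list of claims, under the assumption that each claim is a separate sentence; this assumption is not stated in the model card, so inspect the raw output on your own inputs before relying on it.

```python
import re

# Example decoded output; in practice this would be claims[0] from the
# extraction snippet above.
decoded = (
    "Simone Biles returned to the Olympic stage at the Paris 2024 Games. "
    "Simone Biles competed in the women's gymnastics qualifications."
)

# Naive sentence split: assumes each claim ends with ".", "!", or "?"
# followed by whitespace; swap in a proper sentence splitter if needed.
claim_list = [c.strip() for c in re.split(r"(?<=[.!?])\s+", decoded) if c.strip()]

for i, claim in enumerate(claim_list, start=1):
    print(f"{i}. {claim}")
```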