BART fine-tuned for keyphrase generation
This is the bart-base (Lewis et al., 2019) model fine-tuned for the keyphrase generation task (Glazkova & Morozov, 2023) on fragments of the following corpora:
- Krapivin (Krapivin et al., 2009)
- Inspec (Hulth, 2003)
- KPTimes (Gallina, 2019)
- DUC-2001 (Wan, 2008)
- PubMed (Schutz, 2008)
- NamedKeys (Gero & Ho, 2019).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("aglazkova/bart_finetuned_keyphrase_extraction")
model = AutoModelForSeq2SeqLM.from_pretrained("aglazkova/bart_finetuned_keyphrase_extraction")
text = "In this paper, we investigate cross-domain limitations of keyphrase generation using the models for abstractive text summarization. \
We present an evaluation of BART fine-tuned for keyphrase generation across three types of texts, \
namely scientific texts from computer science and biomedical domains and news texts. \
We explore the role of transfer learning between different domains to improve the model performance on small text corpora."
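# Tokenize the input; calling the tokenizer directly replaces the deprecated prepare_seq2seq_batch
inputs = tokenizer([text], return_tensors="pt", truncation=True, padding="longest")
# Generate the keyphrase sequence and decode it to a plain string
generated_ids = model.generate(**inputs)
keyphrases = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(keyphrases)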
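The snippet above prints a single decoded string. If a list of keyphrases is needed, a small post-processing step can split it. The following is a minimal sketch, not part of the released model: the helper name `split_keyphrases` is hypothetical, and the comma separator is an assumption that should be checked against the model's actual output before use.

```python
from typing import List


def split_keyphrases(generated: str, sep: str = ",") -> List[str]:
    """Split a generated string into a deduplicated, order-preserving list.

    `sep` is an assumed delimiter; adjust it to match the model's real output.
    """
    seen = set()
    result = []
    for part in generated.split(sep):
        phrase = " ".join(part.split())  # trim and collapse inner whitespace
        key = phrase.lower()  # compare case-insensitively
        if phrase and key not in seen:
            seen.add(key)
            result.append(phrase)
    return result


# Hand-written example string, not real model output
example = "keyphrase generation, transfer learning,  BART, Transfer Learning"
print(split_keyphrases(example))
# ['keyphrase generation', 'transfer learning', 'BART']
```

Deduplication is case-insensitive and keeps the first spelling seen, which is useful because sequence-to-sequence models may repeat a phrase.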
Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4e-5
- train_batch_size: 8
- optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
- num_epochs: 6
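For anyone reproducing the setup, the values listed above can be collected into keyword arguments. This is a sketch under assumptions, not the authors' training script: the key names follow the argument names of `Seq2SeqTrainingArguments` in transformers, `train_batch_size` is assumed to be the per-device batch size, and the dataset size in the example is hypothetical.

```python
import math

# Hyperparameters from the list above, keyed by the argument names used by
# transformers.Seq2SeqTrainingArguments (an assumed mapping).
training_kwargs = {
    "learning_rate": 4e-5,
    "per_device_train_batch_size": 8,  # assumes train_batch_size is per device
    "num_train_epochs": 6,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
}


def total_optimizer_steps(num_examples: int, batch_size: int, num_epochs: int) -> int:
    """Optimizer steps on one device without gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs


# Hypothetical corpus of 1,000 training examples
steps = total_optimizer_steps(
    1000,
    training_kwargs["per_device_train_batch_size"],
    training_kwargs["num_train_epochs"],
)
print(steps)  # 750
```

With transformers installed, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir` is a placeholder path.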
BibTeX:
@InProceedings{10.1007/978-3-031-67826-4_19,
author="Glazkova, Anna
and Morozov, Dmitry",
title="Cross-Domain Robustness of Transformer-Based Keyphrase Generation",
booktitle="Data Analytics and Management in Data Intensive Domains",
year="2024",
publisher="Springer Nature Switzerland",
address="Cham",
pages="249--265"
}