synthsumm Collection: generalist summarizers trained on curated long docs + synthetic LLM summaries (5 items)
Fine-tuned on a synthetic dataset of curated long-context text and GPT-3.5-turbo-1106
summaries spanning multiple domains + "random" long-context examples from pretraining datasets
synthsumm data
Try it: gradio demo | free HF inference API via requests | .md with example outputs (gauntlet)
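As a rough illustration of the "inference API via requests" option, here is a minimal sketch. The endpoint URL follows the classic serverless Inference API pattern and is an assumption; it may differ from what Hugging Face currently serves, so check the official docs. `build_request` and `query` are hypothetical helper names, not part of any library.

```python
MODEL_ID = "pszemraj/long-t5-tglobal-base-synthsumm_direct"
# Assumption: classic serverless endpoint pattern; verify against current HF docs.
API_URL = f"https://api-inference.huggingface.co/models/{MODEL_ID}"


def build_request(text, token=None):
    """Return (url, headers, payload) for a summarization request."""
    # The Authorization header is only added when a token is supplied.
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    payload = {"inputs": text}
    return API_URL, headers, payload


def query(text, token=None, timeout=120):
    """POST the text to the endpoint and return the decoded JSON response."""
    import requests  # imported lazily so build_request works without it

    url, headers, payload = build_request(text, token)
    response = requests.post(url, headers=headers, json=payload, timeout=timeout)
    response.raise_for_status()
    return response.json()
```

Calling `query("put the text you don't want to read here", token="hf_...")` would send the request; the token value shown is a placeholder.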
It's recommended to use this model with beam search decoding. If interested, you can also use the textsum
util repo to have most of the decoding setup abstracted out for you:
    pip install -U textsum

Then, in Python:

    from textsum.summarize import Summarizer

    model_name = "pszemraj/long-t5-tglobal-base-synthsumm_direct"
    summarizer = Summarizer(model_name)  # GPU auto-detected
    text = "put the text you don't want to read here"
    summary = summarizer.summarize_string(text)
    print(summary)
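If you would rather apply the beam search recommendation yourself, the following is a minimal sketch using `transformers` directly. The specific values (`num_beams=4`, `no_repeat_ngram_size=3`, the token limits) are illustrative assumptions, not settings taken from this card, and `beam_search_kwargs` and `summarize` are hypothetical helper names.

```python
def beam_search_kwargs(num_beams=4, max_new_tokens=256):
    """Generation settings for deterministic beam search (values are assumptions)."""
    return {
        "num_beams": num_beams,          # number of beams kept at each step
        "max_new_tokens": max_new_tokens,
        "no_repeat_ngram_size": 3,       # block repeated trigrams
        "early_stopping": True,          # stop once enough beams have finished
        "do_sample": False,              # beam search, not sampling
    }


def summarize(text,
              model_name="pszemraj/long-t5-tglobal-base-synthsumm_direct",
              max_input_tokens=4096,     # assumption: adjust to your memory budget
              **overrides):
    """Load the model, run beam search generation, and return the summary."""
    # Imported lazily so the settings helper works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt",
                       truncation=True, max_length=max_input_tokens)
    gen_kwargs = {**beam_search_kwargs(), **overrides}
    output_ids = model.generate(**inputs, **gen_kwargs)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `summarize(long_text, num_beams=8)` would override the beam count while keeping the other defaults.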
This model is a fine-tuned version of google/long-t5-tglobal-base on the synthetic dataset described above. Results on the evaluation set during training:
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.9183 | 0.38 | 125 | 1.5762 | 38.7221 | 15.0873 | 28.3123 | 34.9655 | 129.2154 |
| 1.8815 | 0.77 | 250 | 1.5230 | 44.3531 | 17.9384 | 31.7417 | 39.5563 | 87.3538 |
| 1.7264 | 1.15 | 375 | 1.4735 | 45.7781 | 20.102 | 33.329 | 41.4737 | 101.9231 |
| 1.8545 | 1.54 | 500 | 1.4505 | 47.0134 | 20.6159 | 33.6118 | 41.6579 | 88.2308 |
| 1.7444 | 1.92 | 625 | 1.4378 | 48.0918 | 21.2531 | 34.4307 | 43.0271 | 84.5231 |
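To make the Rouge1 column more concrete, here is a minimal sketch of a unigram-overlap ROUGE-1 F1 score on a 0-100 scale. It assumes lowercasing and whitespace tokenization; the scores in the table were presumably produced by a standard ROUGE package with its own tokenization, so this simplified version will not reproduce them exactly. `rouge1_f1` is a hypothetical helper name.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1 (0-100) using whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped unigram overlap: each token counts at most as often as in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)
```

A short candidate that only contains reference words scores perfect precision but low recall, which is one reason Gen Len is worth reading alongside the ROUGE columns.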
Base model: google/long-t5-tglobal-base