booksum link
README.md CHANGED
@@ -65,7 +65,7 @@ parameters:
 
 ---
 
-# long-t5-tglobal-base-16384
+# long-t5-tglobal-base-16384 + BookSum
 
 - summarize long text and get a SparkNotes-esque summary of arbitrary topics!
 - generalizes reasonably well to academic & narrative text.
@@ -116,7 +116,7 @@ Pass [other parameters related to beam search textgen](https://huggingface.co/bl
 
 ## Training and evaluation data
 
-`kmfoda/booksum` dataset. Summaries longer than 1024 LongT5 tokens were filtered out with the intent of preventing the model from learning to generate "partial" summaries.
+`kmfoda/booksum` dataset on HuggingFace - read [the original paper here](https://arxiv.org/abs/2105.08209). Summaries longer than 1024 LongT5 tokens were filtered out with the intent of preventing the model from learning to generate "partial" summaries.
 
 > - early checkpoints of this model were trained on a "smaller" subsection of the dataset as it was filtered for summaries of **1024 characters**. This was subsequently caught and adjusted to **1024 tokens** and then trained further for at least five epochs.
 
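The README excerpt above describes using this checkpoint to summarize long documents and passing beam-search generation parameters. A minimal usage sketch follows; it assumes the fine-tuned checkpoint is published as `pszemraj/long-t5-tglobal-base-16384-book-summary` (the repository id is not stated in this diff), and the generation settings shown are illustrative rather than the model card's recommended values.

```python
from transformers import pipeline

# Assumed repository id for the BookSum fine-tune of long-t5-tglobal-base-16384;
# substitute the actual model id if it differs.
summarizer = pipeline(
    "summarization",
    model="pszemraj/long-t5-tglobal-base-16384-book-summary",
)

long_text = "..."  # replace with a long narrative or academic document (up to 16384 LongT5 tokens)

result = summarizer(
    long_text,
    max_length=512,           # cap on the generated summary length, in tokens
    min_length=8,
    no_repeat_ngram_size=3,   # discourage verbatim repetition
    num_beams=4,              # beam-search text generation, as the README's usage section mentions
    early_stopping=True,
)
print(result[0]["summary_text"])
```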
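The filtering step described under "Training and evaluation data" (dropping reference summaries longer than 1024 LongT5 tokens, counted in tokens rather than characters) can be sketched as below. This is not the author's preprocessing script; the `summary_text` column name and the choice of the base `google/long-t5-tglobal-base` tokenizer are assumptions.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Base LongT5 tokenizer (assumption: the fine-tune keeps the base model's vocabulary).
tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
dataset = load_dataset("kmfoda/booksum", split="train")

def within_token_limit(example, max_tokens=1024):
    # Count LongT5 *tokens* in the reference summary -- not characters,
    # which is the mistake the note above says was later corrected.
    n_tokens = len(tokenizer(example["summary_text"]).input_ids)
    return n_tokens <= max_tokens

filtered = dataset.filter(within_token_limit)
print(f"kept {len(filtered)} of {len(dataset)} examples")
```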