metadata

tags:
  - generated_from_trainer
datasets:
  - cnn_dailymail
  - xsum
  - samsum
  - billsum
  - lytang/MeetingBank-transcript
metrics:
  - rouge
model-index:
  - name: t5_xsum_samsum_billsum_cnn_dailymail
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: cnn_dailymail
          type: cnn_dailymail
          config: 3.0.0
          split: train
          args: 3.0.0
        metrics:
          - name: Rouge1
            type: rouge
            value: 0.2373
license: mit
language:
  - en
library_name: transformers
pipeline_tag: summarization

t5_xsum_samsum_billsum_cnn_dailymail

The t5_xsum_samsum_billsum_cnn_dailymail model is a text summarization model fine-tuned from t5-base, a text-to-text transfer transformer. It generates abstractive summaries from input text. It has been fine-tuned on multiple datasets: CNN/Daily Mail (cnn_dailymail), XSum (xsum), SAMSum (samsum), BillSum (billsum), and the MeetingBank-transcript dataset by lytang.

Intended Uses & Limitations

Intended Uses

Document summarization: The model can condense news articles, dialogues, legislative bills, and meeting transcripts, the text types covered by its training datasets, which makes it useful for content curation and information extraction.
Content generation: It can produce short, informative snippets from longer input text.
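
A minimal usage sketch for the uses above, assuming the checkpoint is available on the Hugging Face Hub. The repository id in `MODEL_ID` is a placeholder (this card does not give the owner namespace), and the `summarize: ` prefix, the 512-token input limit, and the beam-search settings are assumptions based on common T5 practice rather than settings confirmed by this card.

```python
# Placeholder: replace <namespace> with the actual Hub owner of this checkpoint.
MODEL_ID = "<namespace>/t5_xsum_samsum_billsum_cnn_dailymail"


def build_input(text: str, prefix: str = "summarize: ") -> str:
    """Collapse whitespace and prepend a T5 task prefix (assumed, not confirmed by this card)."""
    return prefix + " ".join(text.split())


def summarize(text: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and return a single summary string."""
    # Imported lazily so build_input() works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long documents to the assumed 512-token encoder limit.
    inputs = tokenizer(
        build_input(text), return_tensors="pt", truncation=True, max_length=512
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` downloads the weights on first use. Inputs longer than the truncation limit lose their tail, so very long documents may need to be split into chunks first.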

Limitations

Model size: t5-base has roughly 220 million parameters, so deployment may require significant compute and memory, which limits its use in resource-constrained environments.
Domain-specific content: The model was trained on news, dialogue, legislative, and meeting data, so its performance may vary on content from other domains.

Training and Evaluation Data

The model has been trained on a diverse set of datasets, including CNN/Daily Mail, XSum, SamSum, BillSum, and the MeetingBank-transcript dataset. These datasets provide a wide range of text summarization examples, enabling the model to generalize across various domains and styles of text.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 1
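
To make the `linear` scheduler concrete, the sketch below computes the learning rate after a given number of optimizer steps. It assumes no warmup (none is listed above) and a decay to zero over the 32,300 steps reported in the training results for CNN/Daily Mail. The name `linear_lr` is illustrative and not part of the training script.

```python
BASE_LR = 2e-05       # learning_rate from the list above
TOTAL_STEPS = 32300   # steps reported for the single CNN/Daily Mail epoch
WARMUP_STEPS = 0      # assumption: the card lists no warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for step in (0, 16150, 32300):
        print(step, linear_lr(step))
```

Under these assumptions the rate falls to half its initial value at the midpoint of the epoch and reaches zero at the final step.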

Training results

samsum

Rouge1	Rouge2	RougeL	RougeLsum
0.0138	0.0002	0.0138	0.0138

cnn_dailymail

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	RougeL	RougeLsum	Gen Len
1.8486	1.0	32300	1.6478	0.2373	0.1086	0.1972	0.1971	18.9674
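
As an aid to reading the ROUGE columns, here is a simplified sketch of ROUGE-N F1 as clipped n-gram overlap between a generated summary and a reference. It assumes lowercased whitespace tokenization and no stemming, so it will not reproduce the reported numbers exactly. This card does not say which metric implementation produced them.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(prediction: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 (n=1 for Rouge1, n=2 for Rouge2)."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

A short prediction that only uses words from the reference gets perfect precision but low recall, which is one reason short generated summaries can score modestly against longer references.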

Framework versions

Transformers 4.33.0
PyTorch 2.0.0
Datasets 2.1.0
Tokenizers 0.13.3