---
base_model: >-
  /exports/eddie/scratch/s1970716/models/summarization/longt5_xl_gov_memsum_bp_15/checkpoint-1360
tags:
  - generated_from_trainer
datasets:
  - learn3r/gov_report_memsum_bp
metrics:
  - rouge
model-index:
  - name: longt5_xl_gov_memsum_bp_20
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: learn3r/gov_report_memsum_bp
          type: learn3r/gov_report_memsum_bp
        metrics:
          - name: Rouge1
            type: rouge
            value: 42.5601
---

# longt5_xl_gov_memsum_bp_20

This model is a fine-tuned version of `/exports/eddie/scratch/s1970716/models/summarization/longt5_xl_gov_memsum_bp_15/checkpoint-1360` on the [learn3r/gov_report_memsum_bp](https://huggingface.co/datasets/learn3r/gov_report_memsum_bp) dataset. It achieves the following results on the evaluation set:

- Loss: 3.6259
- Rouge1: 42.5601
- Rouge2: 14.1791
- Rougel: 17.9691
- Rougelsum: 40.487
- Gen Len: 1510.8695
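
For quick experimentation, the sketch below shows one way to load the model for inference. It is not part of the original card, and it assumes the checkpoint is published on the Hub as `learn3r/longt5_xl_gov_memsum_bp_20`; substitute your own path if the weights live elsewhere.

```python
# Hypothetical usage sketch; the model ID below is an assumption, not a
# confirmed Hub location.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "learn3r/longt5_xl_gov_memsum_bp_20"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

report = "..."  # a long government report to summarize
inputs = tokenizer(report, return_tensors="pt", truncation=True, max_length=16384)
summary_ids = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Note the high average generation length on the evaluation set (about 1511 tokens), so budget `max_new_tokens` accordingly.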

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.001
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 64
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 5.0
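
The original training script is not included in the card. As a non-authoritative sketch, this is roughly how the values above map onto the standard `Seq2SeqTrainingArguments` from Transformers; the Adam betas and epsilon listed are the library defaults, and `output_dir` is a placeholder:

```python
# Illustrative mapping of the reported hyperparameters onto
# Seq2SeqTrainingArguments; not the original training setup.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="longt5_xl_gov_memsum_bp_20",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=64,  # 2 * 64 = effective train batch of 128
    lr_scheduler_type="constant",
    num_train_epochs=5.0,
    predict_with_generate=True,  # needed to compute ROUGE during evaluation
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults
)
```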

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len   |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:---------:|
| 0.1591        | 1.0   | 136  | 3.6259          | 42.5601 | 14.1791 | 17.9691 | 40.487    | 1510.8695 |
| 0.1186        | 1.99  | 272  | 3.7885          | 39.795  | 13.2493 | 17.3095 | 37.9065   | 1707.0401 |
| 0.097         | 3.0   | 409  | 4.0192          | 41.4441 | 13.4026 | 17.8804 | 39.4502   | 1442.7729 |
| 0.0818        | 4.0   | 545  | 4.1699          | 40.2374 | 13.5869 | 17.364  | 38.2969   | 1741.0236 |
| 0.0786        | 4.99  | 680  | 4.3339          | 39.5612 | 13.4283 | 17.3666 | 37.6526   | 1710.4111 |
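
The card does not include the evaluation code, but ROUGE scores like those above are typically computed with the `evaluate` library. A minimal sketch, assuming the standard `rouge` metric and placeholder texts:

```python
# Minimal ROUGE sketch; the Trainer reports these scores scaled to
# percentages (e.g. rouge1 * 100 gives values like 42.5601 above).
import evaluate

rouge = evaluate.load("rouge")
predictions = ["model-generated summary ..."]  # placeholder
references = ["reference summary ..."]         # placeholder
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({k: round(v * 100, 4) for k, v in scores.items()})
```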

### Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.1