
longt5_xl_gov_memsum_bp_20

This model is a fine-tuned version of /exports/eddie/scratch/s1970716/models/summarization/longt5_xl_gov_memsum_bp_15/checkpoint-1360 on the learn3r/gov_report_memsum_bp dataset. It achieves the following results on the evaluation set:

  • Loss: 3.6259
  • ROUGE-1: 42.5601
  • ROUGE-2: 14.1791
  • ROUGE-L: 17.9691
  • ROUGE-Lsum: 40.487
  • Gen Len: 1510.8695

Model description

More information needed

Intended uses & limitations

More information needed
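
Pending a fuller description, here is a minimal usage sketch. It assumes the checkpoint can be loaded from the Hub under the id learn3r/longt5_xl_gov_memsum_bp_20; the input and generation lengths are illustrative, not settings confirmed by this card.

```python
# Minimal sketch: summarizing a long government report with this checkpoint.
# The Hub id and sequence lengths are assumptions, not confirmed settings.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "learn3r/longt5_xl_gov_memsum_bp_20"  # assumed Hub id for this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

report = "..."  # a long government report (thousands of tokens)
inputs = tokenizer(report, return_tensors="pt", truncation=True, max_length=16384)
summary_ids = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Note that the reported average generation length is roughly 1,500 tokens, so a generous max_new_tokens is needed to avoid truncating summaries.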

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • num_epochs: 5.0
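
For reference, the settings above map onto Seq2SeqTrainingArguments roughly as follows. This is a hedged sketch, not the original training script: the output directory is illustrative, and the Adam betas/epsilon listed above are the library defaults.

```python
# Hedged sketch of the listed hyperparameters as Seq2SeqTrainingArguments
# (Transformers 4.34.1). output_dir is illustrative; Adam betas/epsilon are defaults.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="longt5_xl_gov_memsum_bp_20",  # illustrative output path
    learning_rate=1e-3,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=64,  # 2 x 64 gives the effective train batch size of 128
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=5.0,
    predict_with_generate=True,      # needed to compute ROUGE during evaluation
)
```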

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len   |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:----------:|:---------:|
| 0.1591        | 1.0   | 136  | 3.6259          | 42.5601 | 14.1791 | 17.9691 | 40.487     | 1510.8695 |
| 0.1186        | 1.99  | 272  | 3.7885          | 39.795  | 13.2493 | 17.3095 | 37.9065    | 1707.0401 |
| 0.097         | 3.0   | 409  | 4.0192          | 41.4441 | 13.4026 | 17.8804 | 39.4502    | 1442.7729 |
| 0.0818        | 4.0   | 545  | 4.1699          | 40.2374 | 13.5869 | 17.364  | 38.2969    | 1741.0236 |
| 0.0786        | 4.99  | 680  | 4.3339          | 39.5612 | 13.4283 | 17.3666 | 37.6526    | 1710.4111 |
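
The row at epoch 1.0 corresponds to the headline evaluation numbers above. A hedged sketch of recomputing ROUGE with the evaluate library follows; the split and column names are assumptions about learn3r/gov_report_memsum_bp and may need adjusting.

```python
# Hedged sketch: scoring generated summaries against references with `evaluate`.
# The split name ("validation") and column name ("summary") are assumptions.
import evaluate
from datasets import load_dataset

rouge = evaluate.load("rouge")
data = load_dataset("learn3r/gov_report_memsum_bp", split="validation")

subset = data.select(range(8))                         # small slice for illustration
references = [ex["summary"] for ex in subset]          # assumed reference column
predictions = ["model summary here" for _ in subset]   # generated summaries go here

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({name: round(value * 100, 4) for name, value in scores.items()})
```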

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.5
  • Tokenizers 0.14.1
