---
license: apache-2.0
base_model: facebook/bart-large
tags:
  - generated_from_trainer
datasets:
  - learn3r/gov_report_memsum_oracle
metrics:
  - rouge
model-index:
  - name: bart_large_gov
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: learn3r/gov_report_memsum_oracle
          type: learn3r/gov_report_memsum_oracle
        metrics:
          - name: Rouge1
            type: rouge
            value: 71.9948
---

# bart_large_gov

This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on the [learn3r/gov_report_memsum_oracle](https://huggingface.co/datasets/learn3r/gov_report_memsum_oracle) dataset. It achieves the following results on the evaluation set:

- Loss: 1.4266
- Rouge1: 71.9948
- Rouge2: 41.0084
- RougeL: 38.0938
- RougeLsum: 69.4488
- Gen Len: 751.0288
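
As a quick usage sketch, assuming the checkpoint is published on the Hub as `learn3r/bart_large_gov` (the repo id is an assumption based on the model name; point `model=` at a local checkpoint directory otherwise):

```python
from transformers import pipeline

# Repo id assumed from the model name above; adjust if the checkpoint
# lives elsewhere (e.g. a local training output directory).
summarizer = pipeline("summarization", model="learn3r/bart_large_gov")

report = "..."  # a GovReport-style source document
summary = summarizer(
    report,
    max_length=1024,  # generated summaries here average ~751 tokens
    truncation=True,  # bart-large's encoder accepts at most 1024 input tokens
)
print(summary[0]["summary_text"])
```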

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

The model was fine-tuned and evaluated on the learn3r/gov_report_memsum_oracle dataset; no further details were provided.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (mirrored in the configuration sketch after this list):

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20.0
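
For reference, a minimal `Seq2SeqTrainingArguments` sketch that mirrors the list above. The output directory and evaluation schedule are assumptions, and the Adam betas/epsilon shown in the list are the library defaults, so they are not set explicitly:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart_large_gov",     # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=16,  # effective train batch size: 8 * 16 = 128
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20.0,
    predict_with_generate=True,      # needed so eval can compute ROUGE on generated text
    evaluation_strategy="epoch",     # assumption: the results table logs once per epoch
)
```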

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 1.7352        | 1.0   | 136  | 1.5224          | 72.0472 | 41.3267 | 36.4817 | 69.4011   | 685.9300 |
| 1.6874        | 1.99  | 272  | 1.4779          | 71.7737 | 40.8546 | 36.8472 | 69.2034   | 699.4866 |
| 1.5695        | 3.0   | 409  | 1.4583          | 72.2243 | 41.372  | 37.8382 | 69.6295   | 695.0977 |
| 1.4951        | 3.99  | 545  | 1.4495          | 71.5808 | 40.5556 | 37.152  | 69.0536   | 753.5967 |
| 1.496         | 5.0   | 682  | 1.4386          | 72.1271 | 41.1645 | 38.4096 | 69.6176   | 700.2160 |
| 1.4258        | 6.0   | 818  | 1.4374          | 71.9975 | 41.0013 | 37.9947 | 69.449    | 743.7068 |
| 1.4301        | 7.0   | 955  | 1.4296          | 71.8896 | 40.8303 | 38.346  | 69.357    | 724.5062 |
| 1.4015        | 8.0   | 1091 | 1.4313          | 72.0031 | 40.9229 | 38.2581 | 69.4154   | 731.2685 |
| 1.391         | 8.99  | 1227 | 1.4266          | 71.9948 | 41.0084 | 38.0938 | 69.4488   | 751.0288 |
| 1.3642        | 10.0  | 1364 | 1.4287          | 71.9115 | 40.8683 | 38.1602 | 69.3514   | 756.9568 |
| 1.3516        | 10.99 | 1500 | 1.4289          | 72.3822 | 41.5074 | 38.8088 | 69.8232   | 719.2798 |
| 1.3243        | 12.0  | 1637 | 1.4301          | 71.83   | 40.764  | 38.1124 | 69.2767   | 749.9475 |
| 1.3582        | 12.99 | 1773 | 1.4283          | 71.9495 | 40.9556 | 38.4201 | 69.4394   | 736.6698 |
| 1.3149        | 14.0  | 1910 | 1.4298          | 71.9599 | 40.8875 | 38.2722 | 69.4209   | 753.3230 |
| 1.288         | 15.0  | 2046 | 1.4326          | 72.1615 | 41.1549 | 38.611  | 69.5977   | 744.8858 |
| 1.2937        | 16.0  | 2183 | 1.4315          | 71.9783 | 40.9073 | 38.4263 | 69.4109   | 755.5340 |
| 1.258         | 17.0  | 2319 | 1.4328          | 72.0298 | 40.931  | 38.4845 | 69.4823   | 734.6399 |
| 1.2617        | 17.99 | 2455 | 1.4336          | 71.9488 | 40.8816 | 38.4521 | 69.4151   | 744.7068 |
| 1.2864        | 19.0  | 2592 | 1.4346          | 72.1334 | 40.9965 | 38.5682 | 69.5666   | 744.2449 |
| 1.2936        | 19.94 | 2720 | 1.4351          | 72.0397 | 40.9431 | 38.4161 | 69.5028   | 744.4588 |
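
The headline metrics at the top of this card match the step-1227 row, which has the lowest validation loss (1.4266), suggesting the checkpoint with the best validation loss was the one kept. The ROUGE columns are, by the usual `run_summarization.py` convention, F1 scores scaled to 0-100; a minimal sketch of computing such scores with the `evaluate` library (predictions and references below are placeholders):

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder texts; in practice, predictions come from model.generate()
# and references are the dataset's gold summaries.
predictions = ["the committee recommends increased oversight of the program"]
references = ["the committee recommends increased oversight of the grant program"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# evaluate returns fractions in [0, 1]; the table above reports them scaled by 100.
print({name: round(value * 100, 4) for name, value in scores.items()})
```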

### Framework versions

- Transformers 4.37.0.dev0
- Pytorch 2.0.1+cu117
- Datasets 2.14.5
- Tokenizers 0.15.0