---
license: apache-2.0
base_model: facebook/bart-large
tags:
  - generated_from_trainer
datasets:
  - learn3r/gov_report_memsum_oracle
metrics:
  - rouge
model-index:
  - name: bart_large_gov
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: learn3r/gov_report_memsum_oracle
          type: learn3r/gov_report_memsum_oracle
        metrics:
          - name: Rouge1
            type: rouge
            value: 56.2783
---

# bart_large_gov

This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on the learn3r/gov_report_memsum_oracle dataset. It achieves the following results on the evaluation set:

- Loss: 1.4450
- Rouge1: 56.2783
- Rouge2: 31.1387
- RougeL: 39.2121
- RougeLsum: 51.8068
- Gen Len: 128.5062
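For quick experimentation, the model can be loaded with the `transformers` summarization pipeline. This is a minimal sketch, not part of the original card: the Hub id `learn3r/bart_large_gov` is inferred from the model name above, and the generation lengths are chosen to roughly match the ~128-token summaries seen in evaluation.

```python
from transformers import pipeline

# Hub id inferred from this card's model name; adjust if the repo differs.
summarizer = pipeline("summarization", model="learn3r/bart_large_gov")

report = "..."  # placeholder: a long government report to summarize

# BART-large's encoder accepts at most 1024 tokens, so truncate long inputs.
summary = summarizer(report, truncation=True, max_length=128, min_length=32)
print(summary[0]["summary_text"])
```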

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
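While the card leaves this section blank, the dataset named above is on the Hugging Face Hub and can be inspected directly. A minimal sketch; the split and column names are whatever the repo exposes, so the code prints them rather than assuming any:

```python
from datasets import load_dataset

# Load the fine-tuning dataset referenced on this card.
ds = load_dataset("learn3r/gov_report_memsum_oracle")

print(ds)                     # available splits and row counts
first_split = next(iter(ds))  # name of the first split
print(ds[first_split][0])     # one example, showing the column layout
```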

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20.0
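The list above maps directly onto `Seq2SeqTrainingArguments`. The sketch below is an approximate reconstruction, not the author's actual training script; the output directory and `predict_with_generate` are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Approximate reconstruction of the hyperparameters listed above.
# Adam betas/epsilon and the linear schedule are the Trainer defaults.
args = Seq2SeqTrainingArguments(
    output_dir="bart_large_gov",     # assumed; not stated on the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=16,  # 8 * 16 = 128 effective batch size
    num_train_epochs=20.0,
    lr_scheduler_type="linear",
    seed=42,
    predict_with_generate=True,      # assumed; needed for ROUGE during eval
)
```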

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 1.6694        | 1.0   | 136  | 1.5338          | 54.2061 | 29.3577 | 37.2911 | 49.8337   | 139.7253 |
| 1.5178        | 1.99  | 272  | 1.4698          | 55.6621 | 30.6254 | 38.7491 | 51.2934   | 128.9475 |
| 1.4208        | 3.0   | 409  | 1.4487          | 55.4905 | 30.4201 | 38.412  | 51.1108   | 129.5658 |
| 1.3399        | 3.99  | 545  | 1.4450          | 56.2783 | 31.1387 | 39.2121 | 51.8068   | 128.5062 |
| 1.2326        | 5.0   | 682  | 1.4478          | 56.0182 | 30.7104 | 38.8337 | 51.6162   | 129.1358 |
| 1.1784        | 6.0   | 818  | 1.4533          | 56.4333 | 31.4483 | 39.5546 | 52.1347   | 128.7315 |
| 1.1739        | 7.0   | 955  | 1.4607          | 56.3636 | 31.1125 | 39.4055 | 51.9709   | 128.8241 |
| 1.1585        | 8.0   | 1091 | 1.4774          | 55.9356 | 30.7012 | 38.7824 | 51.5664   | 128.9640 |
| 1.0297        | 8.99  | 1227 | 1.4939          | 56.7487 | 31.552  | 39.6461 | 52.411    | 128.6553 |
| 1.0085        | 10.0  | 1364 | 1.5075          | 56.3918 | 31.2201 | 39.4213 | 51.9449   | 128.6265 |
| 0.9738        | 10.99 | 1500 | 1.5237          | 56.3041 | 30.9239 | 39.2625 | 51.8217   | 128.8282 |
| 0.9583        | 12.0  | 1637 | 1.5444          | 55.6539 | 30.2395 | 38.5901 | 51.1518   | 128.9136 |
| 0.9601        | 12.99 | 1773 | 1.5516          | 55.9154 | 30.5471 | 38.8607 | 51.2856   | 128.9784 |
| 0.8882        | 14.0  | 1910 | 1.5736          | 56.3282 | 30.9807 | 39.2351 | 51.8022   | 128.5206 |
| 0.851         | 15.0  | 2046 | 1.5891          | 56.0531 | 30.6748 | 38.8847 | 51.5739   | 128.7623 |
| 0.8825        | 16.0  | 2183 | 1.5978          | 56.0084 | 30.7943 | 38.9692 | 51.5587   | 128.7798 |
| 0.8169        | 17.0  | 2319 | 1.6076          | 55.8274 | 30.41   | 38.6258 | 51.3009   | 128.8632 |
| 0.8194        | 17.99 | 2455 | 1.6177          | 56.3214 | 30.9896 | 39.4754 | 51.9525   | 128.6461 |
| 0.8441        | 19.0  | 2592 | 1.6260          | 55.9842 | 30.6332 | 38.999  | 51.5685   | 128.8241 |
| 0.792         | 19.94 | 2720 | 1.6328          | 55.8983 | 30.5018 | 38.7764 | 51.3611   | 128.7407 |
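The ROUGE columns above are F-measures scaled by 100, as produced by the standard `evaluate` metric. A minimal sketch of how such scores are computed; the prediction and reference texts are placeholders:

```python
import evaluate

# The table reports ROUGE F1 scores multiplied by 100.
rouge = evaluate.load("rouge")

predictions = ["the agency reported rising costs"]          # model summaries
references = ["the agency reported that costs are rising"]  # gold summaries

scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v * 100, 4) for k, v in scores.items()})
# keys: rouge1, rouge2, rougeL, rougeLsum
```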

### Framework versions

- Transformers 4.37.0.dev0
- Pytorch 2.0.1+cu117
- Datasets 2.14.5
- Tokenizers 0.15.0