---
license:
  - bsd-3-clause
  - apache-2.0
tags:
  - generated_from_trainer
datasets:
  - pszemraj/scientific_lay_summarisation-plos-norm
metrics:
  - rouge
model-index:
  - name: >-
      long-t5-tglobal-xl-16384-book-summary-scientific_lay_summarisation-plos-norm-16384-summ-v1
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: pszemraj/scientific_lay_summarisation-plos-norm
          type: pszemraj/scientific_lay_summarisation-plos-norm
          split: validation
        metrics:
          - name: Rouge1
            type: rouge
            value: 44.3203
inference: false
---

long-t5-tglobal-xl-16384-booksci-summary-plos-10k

This model is a fine-tuned version of pszemraj/long-t5-tglobal-xl-16384-book-summary on the pszemraj/scientific_lay_summarisation-plos-norm dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5041
  • Rouge1: 44.3203
  • Rouge2: 11.0576
  • Rougel: 22.7584
  • Rougelsum: 40.1462
  • Gen Len: 256.66

Model description

Another experiment in further fine-tuning booksum-based models. This one was fine-tuned on the PLOS subset of the lay-summarisation data for roughly 10k input examples, making it roughly comparable to this checkpoint, which was fine-tuned on the eLife subset for two epochs (also around 10k examples).
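
Hosted inference is disabled for this card, but the checkpoint can be loaded locally with the standard transformers summarization pipeline. A minimal sketch, assuming the repo id matches the title above; the generation settings here are illustrative, not the settings used for evaluation.

```python
from transformers import pipeline

# Repo id assumed from the model card title; this is an XL (~3B parameter) checkpoint,
# so expect a large download and high memory use. device_map="auto" assumes accelerate.
summarizer = pipeline(
    "summarization",
    model="pszemraj/long-t5-tglobal-xl-16384-booksci-summary-plos-10k",
    device_map="auto",
)

long_document = "..."  # up to ~16,384 tokens of scientific text

# Illustrative generation parameters; evaluation above reports Gen Len around 256.
result = summarizer(
    long_document,
    max_length=256,
    min_length=32,
    no_repeat_ngram_size=3,
)
print(result[0]["summary_text"])
```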

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
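
The metadata and description above name the dataset and a ~10k-example budget; a minimal sketch for inspecting it with the datasets library. The train-split slicing is an assumption for illustration, not the exact subset used for fine-tuning.

```python
from datasets import load_dataset

# Dataset named in the model card metadata.
ds = load_dataset("pszemraj/scientific_lay_summarisation-plos-norm")

# The description mentions roughly 10k training examples; slicing like this is
# illustrative and may not match the exact subset used during training.
train_subset = ds["train"].select(range(10_000))
eval_split = ds["validation"]  # the reported ROUGE scores use the validation split

print(train_subset)
print(eval_split.column_names)
```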

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 165
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 1.0
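
These values map directly onto the standard transformers Seq2SeqTrainingArguments; a sketch under that assumption. output_dir and any option not listed above are placeholders, not values from the actual training run.

```python
from transformers import Seq2SeqTrainingArguments

# Mirror of the hyperparameters listed above; anything not listed (output_dir,
# precision, logging, etc.) is a placeholder assumption.
training_args = Seq2SeqTrainingArguments(
    output_dir="./long-t5-xl-plos-10k",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,       # effective train batch size 1 x 8 = 8
    seed=165,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,
    num_train_epochs=1.0,
    predict_with_generate=True,          # assumption: needed to compute ROUGE at eval time
)
```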

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 1.7715        | 0.28  | 350  | 1.5310          | 43.4729 | 10.4616 | 22.1928 | 39.505    | 260.87  |
| 1.9307        | 0.56  | 700  | 1.5102          | 44.1634 | 10.9336 | 22.3896 | 40.2939   | 253.58  |
| 1.2981        | 0.84  | 1050 | 1.5046          | 44.2728 | 10.8455 | 22.4122 | 40.3019   | 261.29  |