This model is a fine-tuned version of T-Systems' summarization model v1.

We fine-tuned it on 1,000 headline-content pairs from BR24 articles.

Despite the small amount of training data, the tone of the summaries has changed significantly. Many of the resulting summaries read like headlines.

Training

We used the following parameters for training this model:

  • base model: deutsche-telekom/mt5-small-sum-de-en-v1
  • source_prefix: "summarize: "
  • batch size: 4
  • max_source_length: 400
  • max_target_length: 35
  • weight_decay: 0.01
  • number of train epochs: 1
  • learning rate: 5e-5
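
The parameters above could be collected in a small configuration object, as in the minimal sketch below. It is not the original training script. The names `FineTuneConfig`, `to_training_kwargs`, and `add_prefix` are hypothetical; the keyword names returned by `to_training_kwargs` match arguments of `transformers.Seq2SeqTrainingArguments`, and mapping "batch size" to `per_device_train_batch_size` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed in this model card (class name is hypothetical)."""
    base_model: str = "deutsche-telekom/mt5-small-sum-de-en-v1"
    source_prefix: str = "summarize: "
    batch_size: int = 4
    max_source_length: int = 400   # maximum input length in tokens
    max_target_length: int = 35    # maximum headline length in tokens
    weight_decay: float = 0.01
    num_train_epochs: int = 1
    learning_rate: float = 5e-5


def to_training_kwargs(cfg: FineTuneConfig) -> dict:
    """Map the config to keyword names used by transformers.Seq2SeqTrainingArguments.

    Assumption: the batch size in the card is the per-device train batch size.
    """
    return {
        "per_device_train_batch_size": cfg.batch_size,
        "weight_decay": cfg.weight_decay,
        "num_train_epochs": cfg.num_train_epochs,
        "learning_rate": cfg.learning_rate,
    }


def add_prefix(cfg: FineTuneConfig, articles: list[str]) -> list[str]:
    """Prepend the source prefix to each article text before tokenization."""
    return [cfg.source_prefix + text for text in articles]


config = FineTuneConfig()
training_kwargs = to_training_kwargs(config)
prefixed = add_prefix(config, ["Beispieltext eines Artikels."])
```

The dictionary from `to_training_kwargs` can be unpacked into `Seq2SeqTrainingArguments` together with an output directory. The two length limits would be passed to the tokenizer as `max_length` with truncation enabled when encoding the prefixed articles and the headlines.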

License

Because the base model was trained on the MLSUM dataset, this model may not be used commercially.

Stats

Model                                    Rouge1     Rouge2    RougeL     RougeLSum
headlines_test_small_example             13.573500  3.694700  12.560600  12.60000
deutsche-telekom/mt5-small-sum-de-en-v1  10.6488    2.9313    10.0527    10.0523
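
To compare the two rows, the absolute difference per metric can be computed directly from the scores listed in the table. The short sketch below does only that arithmetic; the helper name `rouge_deltas` is hypothetical and no new measurements are introduced.

```python
# ROUGE scores copied from the stats table in this model card.
scores = {
    "headlines_test_small_example": {
        "Rouge1": 13.5735, "Rouge2": 3.6947, "RougeL": 12.5606, "RougeLSum": 12.6,
    },
    "deutsche-telekom/mt5-small-sum-de-en-v1": {
        "Rouge1": 10.6488, "Rouge2": 2.9313, "RougeL": 10.0527, "RougeLSum": 10.0523,
    },
}


def rouge_deltas(fine_tuned: dict, base: dict) -> dict:
    """Return fine-tuned minus base score for each metric, rounded to 4 decimals."""
    return {metric: round(fine_tuned[metric] - base[metric], 4) for metric in fine_tuned}


deltas = rouge_deltas(
    scores["headlines_test_small_example"],
    scores["deutsche-telekom/mt5-small-sum-de-en-v1"],
)

for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.4f}")
```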