---
license: apache-2.0
base_model: google/flan-t5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: maximo-t5-chat
    results: []
---

# maximo-t5-chat

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.1056
- Rouge1: 59.8095
- Rouge2: 47.0
- Rougel: 59.8095
- Rougelsum: 59.8095
- Gen Len: 14.6
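The ROUGE scores above are F-measures scaled by 100. The card was most likely evaluated with the `evaluate`/`rouge_score` packages; as a simplified, self-contained sketch of what ROUGE-1 measures, the unigram-overlap F1 can be computed like this:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: unigram-overlap F-measure between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Identical strings score a perfect 1.0 (reported as 100 on this card's scale).
print(rouge1_f1("restart the application server", "restart the application server"))  # → 1.0
```

Note this sketch skips the stemming and tokenization refinements the real `rouge_score` implementation applies.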

## Model description

More information needed

## Intended uses & limitations

More information needed
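Until usage details are filled in, a seq2seq checkpoint like this one can be queried through the `transformers` text2text pipeline. The sketch below loads the base model named in this card; the fine-tuned checkpoint's own path or Hub id (an assumption here) would be substituted to query maximo-t5-chat itself.

```python
from transformers import pipeline

# "google/flan-t5-small" is the base model named in this card; swap in the
# fine-tuned checkpoint's local output directory or Hub id to use maximo-t5-chat.
generator = pipeline("text2text-generation", model="google/flan-t5-small")

result = generator("How do I restart the application server?", max_new_tokens=32)
print(result[0]["generated_text"])
```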

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
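As a hedged reconstruction, the hyperparameters above map onto `Seq2SeqTrainingArguments` like this; `output_dir` and `predict_with_generate` are assumptions, and the Adam betas/epsilon listed above are the `Trainer` defaults:

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the hyperparameters listed above. output_dir and
# predict_with_generate are assumptions; the rest mirror the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="maximo-t5-chat",      # assumed output path
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    predict_with_generate=True,       # needed for ROUGE during evaluation
)
```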

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 5    | 2.8921          | 15.7143 | 0.0    | 15.7143 | 15.0      | 7.2     |
| No log        | 2.0   | 10   | 2.1172          | 19.0    | 0.0    | 19.1905 | 18.5714   | 8.2     |
| No log        | 3.0   | 15   | 1.7513          | 33.7143 | 20.0   | 34.0    | 33.3333   | 7.8     |
| No log        | 4.0   | 20   | 1.4905          | 46.7143 | 34.0   | 47.0476 | 46.7143   | 12.3    |
| No log        | 5.0   | 25   | 1.3527          | 55.5714 | 39.0   | 55.4762 | 54.5714   | 12.9    |
| No log        | 6.0   | 30   | 1.3376          | 58.0952 | 39.0   | 58.0952 | 57.4286   | 13.6    |
| No log        | 7.0   | 35   | 1.2002          | 58.5714 | 39.0   | 58.5714 | 57.8095   | 13.3    |
| No log        | 8.0   | 40   | 1.1349          | 55.0476 | 39.0   | 54.5714 | 54.5714   | 14.3    |
| No log        | 9.0   | 45   | 1.1106          | 59.8095 | 47.0   | 59.8095 | 59.8095   | 14.6    |
| No log        | 10.0  | 50   | 1.1056          | 59.8095 | 47.0   | 59.8095 | 59.8095   | 14.6    |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0