---
license: apache-2.0
base_model: google/flan-t5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: maximo-t5-chat
    results: []
---

# maximo-t5-chat

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 2.1443
- Rouge1: 27.8954
- Rouge2: 7.9325
- RougeL: 27.8954
- RougeLsum: 27.0723
- Gen Len: 12.5
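
These ROUGE values appear to be F-measures on a 0–100 scale, which is what the standard `evaluate` recipe produces when the raw 0–1 scores are multiplied by 100. A minimal sketch of how such scores are computed, with illustrative strings standing in for the undocumented evaluation set:

```python
import evaluate

rouge = evaluate.load("rouge")

# Illustrative strings only; the actual evaluation set is not documented.
predictions = ["the work order status was set to closed"]
references = ["the work order status has been set to closed"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# evaluate returns fractions in [0, 1]; the numbers in this card look scaled by 100
print({k: round(v * 100, 4) for k, v in scores.items()})
```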

## Model description

More information needed

## Intended uses & limitations

More information needed
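
In the absence of documented usage, the checkpoint should load like any other seq2seq model on the Hub. A minimal sketch, assuming a repo id of `maxadmin/maximo-t5-chat` (hypothetical, inferred from this card) and plain-text prompts in the style of the FLAN-T5 base model:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id; substitute the model's actual Hub path or a local directory.
model_id = "maxadmin/maximo-t5-chat"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Example prompt; the expected input format is not documented.
inputs = tokenizer("How do I create a work order in Maximo?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```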

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
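
For reference, these settings correspond roughly to the following `Seq2SeqTrainingArguments`; this is a sketch, not the author's actual script. The optimizer line above matches the Trainer's default configuration:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="maximo-t5-chat",
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    evaluation_strategy="epoch",  # matches the per-epoch eval rows below
    predict_with_generate=True,   # needed for ROUGE and generation-length metrics
)
```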

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 3    | 2.8565          | 15.4762 | 2.439  | 15.0794 | 15.4762   | 16.1667 |
| No log        | 2.0   | 6    | 2.5544          | 13.0291 | 2.439  | 12.6323 | 13.0291   | 16.1667 |
| No log        | 3.0   | 9    | 2.4527          | 14.6825 | 2.439  | 14.2857 | 14.2857   | 10.8333 |
| No log        | 4.0   | 12   | 2.3570          | 20.2381 | 2.439  | 19.8413 | 19.8413   | 11.1667 |
| No log        | 5.0   | 15   | 2.2745          | 27.1017 | 7.9325 | 27.1017 | 26.2787   | 11.3333 |
| No log        | 6.0   | 18   | 2.2170          | 27.8954 | 7.9325 | 27.8954 | 27.0723   | 13.1667 |
| No log        | 7.0   | 21   | 2.1860          | 27.8954 | 7.9325 | 27.8954 | 27.0723   | 12.6667 |
| No log        | 8.0   | 24   | 2.1568          | 27.8954 | 7.9325 | 27.8954 | 27.0723   | 12.5    |
| No log        | 9.0   | 27   | 2.1445          | 27.8954 | 7.9325 | 27.8954 | 27.0723   | 12.5    |
| No log        | 10.0  | 30   | 2.1443          | 27.8954 | 7.9325 | 27.8954 | 27.0723   | 12.5    |

### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0