
ViT-Bert_Mimic

This model is a fine-tuned version of an unspecified base checkpoint on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the metric list):

  • Loss: 0.1305
  • Rouge1: 34.725
  • Rouge2: 21.4916
  • Rougel: 33.3614
  • Rougelsum: 34.1142
  • Gen Len: 20.706
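Based on the name, this is presumably a VisionEncoderDecoderModel pairing a ViT image encoder with a BERT text decoder, fine-tuned for image-to-text generation (MIMIC suggests chest X-ray report generation). Below is a minimal inference sketch under that assumption; the repo id and image path are placeholders, not values from this card:

```python
import torch
from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

model_id = "ViT-Bert_Mimic"  # placeholder; substitute the actual repo id or local path

model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

image = Image.open("example.png").convert("RGB")  # placeholder image
pixel_values = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    # Gen Len above averages ~21 tokens, so a modest max_length suffices
    output_ids = model.generate(pixel_values=pixel_values, max_length=64, num_beams=4)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```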

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
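For reference, these values map onto transformers' Seq2SeqTrainingArguments roughly as follows. This is a sketch, not the recovered training script: output_dir, the evaluation cadence, and predict_with_generate are assumptions, while the Adam settings are the Trainer defaults that the card lists explicitly.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="vit-bert-mimic",      # assumed; not stated on the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                   # Trainer defaults, as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="epoch",      # assumed from the per-epoch results below
    predict_with_generate=True,       # assumed; needed to compute ROUGE at eval time
)
```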

Training results

Training Loss  Epoch  Step    Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
0.0684         1.0    7500    0.0752           34.4312  25.586   34.2067  34.2816    14.065
0.0626         2.0    15000   0.0694           38.0498  26.9882  37.2064  37.6682    19.492
0.0599         3.0    22500   0.0676           37.9403  26.7796  37.0514  37.571     21.805
0.054          4.0    30000   0.0661           38.1215  26.8065  37.3608  37.7763    18.883
0.0484         5.0    37500   0.0658           39.0689  27.489   38.0601  38.8175    20.556
0.043          6.0    45000   0.0679           38.5537  26.6503  37.4722  38.1314    20.994
0.0378         7.0    52500   0.0701           37.8821  26.1994  36.7872  37.4123    19.978
0.0324         8.0    60000   0.0741           38.5791  26.2187  37.3411  38.0767    21.761
0.0269         9.0    67500   0.0787           36.2698  24.3513  35.1553  35.7864    20.512
0.0199         10.0   75000   0.0848           34.8266  22.0111  33.591   34.3348    19.67
0.0158         11.0   82500   0.0921           34.5083  21.5876  33.273   34.0396    20.663
0.0114         12.0   90000   0.0990           33.6601  20.3509  32.3799  33.1785    21.574
0.0078         13.0   97500   0.1057           33.5222  20.262   32.3084  33.0449    20.7
0.0057         14.0   105000  0.1122           32.9482  19.0875  31.6809  32.4176    21.562
0.0037         15.0   112500  0.1172           33.2572  19.0712  31.8675  32.7193    21.432
0.0027         16.0   120000  0.1215           34.0583  20.5815  32.5961  33.4699    21.379
0.0019         17.0   127500  0.1257           34.3046  21.1929  33.0026  33.6992    20.687
0.0013         18.0   135000  0.1280           34.9621  21.8578  33.6017  34.3908    21.249
0.001          19.0   142500  0.1298           35.1328  21.8242  33.7634  34.5288    20.567
0.0007         20.0   150000  0.1305           34.725   21.4916  33.3614  34.1142    20.706
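The ROUGE columns are reported on a 0–100 scale. A minimal sketch of computing comparable scores with the Hugging Face evaluate library follows; using evaluate is an assumption, since the card does not include the actual evaluation script:

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy pair; replace with generated reports and references from the eval set.
predictions = ["no acute cardiopulmonary abnormality is seen"]
references = ["no acute cardiopulmonary abnormality"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# evaluate returns F-measures in [0, 1]; the table scales them to [0, 100].
print({k: round(v * 100, 4) for k, v in scores.items()})
```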

Framework versions

  • Transformers 4.37.1
  • Pytorch 1.13.1+cu117
  • Datasets 2.15.0
  • Tokenizers 0.15.1

Model size: 225M params (Safetensors, F32)