vietnamese_mt5_summary_model_2

This model was trained from scratch on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4813
  • Rouge1: 57.2618
  • Rouge2: 23.5562
  • RougeL: 35.4717
  • RougeLsum: 37.1259
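The ROUGE scores above are F-measures scaled to 0–100. As a rough illustration of what ROUGE-1 measures, the sketch below computes unigram-overlap F1 in plain Python; the official scores would come from the rouge_score package, which additionally applies stemming and its own tokenization.

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between prediction and reference
    (no stemming, whitespace tokenization only)."""
    pred = Counter(prediction.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((pred & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"), 4))
# → 0.8333 (5 of 6 unigrams overlap in both directions)
```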

Model description

More information needed. The published checkpoint holds roughly 300M parameters stored as F32 safetensors.

Intended uses & limitations

More information needed
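The card does not document how the author runs inference. A minimal usage sketch, assuming the standard transformers seq2seq API and beam-search generation settings chosen for illustration only:

```python
# Assumed usage sketch, not taken from the model card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "chibao24/vietnamese_mt5_summary_model_2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # Vietnamese article to summarize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

The generation parameters (`max_new_tokens`, `num_beams`) are placeholders; tune them for your inputs.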

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 3000
  • num_epochs: 10
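Note that with 3,000 warmup steps out of 3,130 total optimizer steps (10 epochs × 313 steps), the learning rate is still warming up for almost the whole run. A sketch of the schedule, assuming the standard warmup-then-linear-decay formula used by transformers' linear scheduler:

```python
def linear_lr(step: int, base_lr: float = 1e-3,
              warmup_steps: int = 3000, total_steps: int = 3130) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0
    (the formula behind transformers.get_linear_schedule_with_warmup)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(linear_lr(1500))  # halfway through warmup: 0.0005
print(linear_lr(3000))  # peak learning rate: 0.001
print(linear_lr(3130))  # end of training: 0.0
```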

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 1.5757        | 1.0   | 313  | 1.4225          | 57.3823 | 23.8437 | 35.6288 | 37.1079   |
| 1.5452        | 2.0   | 626  | 1.4219          | 56.809  | 23.4657 | 35.0977 | 36.7606   |
| 1.5074        | 3.0   | 939  | 1.4122          | 55.9095 | 23.0128 | 35.0274 | 36.4447   |
| 1.4701        | 4.0   | 1252 | 1.4256          | 56.621  | 23.1876 | 35.1323 | 36.5518   |
| 1.431         | 5.0   | 1565 | 1.4381          | 57.2067 | 23.6087 | 35.1239 | 36.7421   |
| 1.3929        | 6.0   | 1878 | 1.4338          | 57.1248 | 23.9446 | 35.3666 | 36.9974   |
| 1.3558        | 7.0   | 2191 | 1.4727          | 57.0482 | 23.1001 | 34.8187 | 36.1817   |
| 1.3197        | 8.0   | 2504 | 1.4928          | 56.0409 | 23.1702 | 35.4414 | 36.858    |
| 1.2861        | 9.0   | 2817 | 1.4917          | 57.1416 | 23.7555 | 35.5747 | 36.9418   |
| 1.2367        | 10.0  | 3130 | 1.4813          | 57.2618 | 23.5562 | 35.4717 | 37.1259   |

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1