
spell_corrector_mt5_01012024

This model is a fine-tuned version of google/mt5-small; the fine-tuning dataset is not named in this card. It achieves the following results on the evaluation set:

  • Loss: 0.6800
  • Bleu: 31.1934
  • Gen Len: 15.8188

Model description

More information needed

Intended uses & limitations

More information needed
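
Until this section is filled in, here is a minimal inference sketch. The repo id matches this card, but the expected input/output format (noisy text in, corrected text out) is an assumption, since the card does not document it:

```python
# Hedged sketch: load the fine-tuned checkpoint and correct one sentence.
# The input format is an assumption; the card does not document it.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

repo_id = "Buseak/spell_corrector_mt5_01012024"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

text = "an exmaple sentense with speling erors"  # hypothetical noisy input
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```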

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
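
The card does not include the training script, but the hyperparameters above map directly onto the Hugging Face `Seq2SeqTrainer` API. A minimal reproduction sketch, with a hypothetical toy dataset standing in for the undocumented training data:

```python
# Hedged reproduction sketch of the listed hyperparameters with
# Seq2SeqTrainer. The toy dataset below is a placeholder: the actual
# training corpus is not documented in this card.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_model = "google/mt5-small"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model)

# Hypothetical noisy/clean pairs; replace with the real corpus.
raw = Dataset.from_dict({
    "noisy": ["an exmaple sentense"],
    "clean": ["an example sentence"],
})

def preprocess(batch):
    model_inputs = tokenizer(batch["noisy"], truncation=True, max_length=64)
    labels = tokenizer(text_target=batch["clean"], truncation=True, max_length=64)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_dataset = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="spell_corrector_mt5",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=15,
    lr_scheduler_type="linear",   # the Adam betas/epsilon listed above are the defaults
    evaluation_strategy="epoch",
    predict_with_generate=True,   # required for BLEU/Gen Len during evaluation
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,   # placeholder: the real eval split is not documented
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```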

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 13.9865       | 1.0   | 976   | 1.6965          | 15.4349 | 13.1848 |
| 2.8374        | 2.0   | 1952  | 1.2468          | 22.1368 | 15.9172 |
| 2.0333        | 3.0   | 2928  | 1.0291          | 25.1577 | 15.9044 |
| 1.7435        | 4.0   | 3904  | 0.9051          | 26.9628 | 15.9317 |
| 1.5687        | 5.0   | 4880  | 0.8453          | 28.2062 | 15.9029 |
| 1.4484        | 6.0   | 5856  | 0.8009          | 28.855  | 15.8879 |
| 1.3627        | 7.0   | 6832  | 0.7722          | 29.3878 | 15.8645 |
| 1.3183        | 8.0   | 7808  | 0.7467          | 29.6686 | 15.8503 |
| 1.2669        | 9.0   | 8784  | 0.7315          | 30.0695 | 15.8451 |
| 1.2321        | 10.0  | 9760  | 0.7162          | 30.3715 | 15.8328 |
| 1.1928        | 11.0  | 10736 | 0.6997          | 30.7385 | 15.8269 |
| 1.1848        | 12.0  | 11712 | 0.6934          | 30.9314 | 15.8242 |
| 1.1658        | 13.0  | 12688 | 0.6880          | 31.0662 | 15.8238 |
| 1.1476        | 14.0  | 13664 | 0.6826          | 31.1585 | 15.8183 |
| 1.1496        | 15.0  | 14640 | 0.6800          | 31.1934 | 15.8188 |
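
The Bleu and Gen Len columns imply that evaluation decoded generated predictions and scored them with BLEU plus an average generation length. A plausible `compute_metrics` hook using sacreBLEU (an assumption; the card does not show the metric code):

```python
# Hedged sketch of a BLEU + generation-length metric hook, in the style
# commonly paired with Seq2SeqTrainer; the card's actual code is not shown.
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels use -100 for padding; restore the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(predictions=decoded_preds,
                          references=[[label] for label in decoded_labels])
    gen_len = np.mean([np.count_nonzero(pred != tokenizer.pad_token_id)
                       for pred in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```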

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0
