
TURNA_spell_correction_product_search

This model is a fine-tuned version of boun-tabi-LMG/TURNA on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1088
  • Rouge1: 0.8437
  • Rouge2: 0.7401
  • RougeL: 0.8435
  • RougeLsum: 0.8437
  • Bleu: 0.8713
  • Precisions: [0.8736109932988378, 0.8306083370157608, 0.8473118279569892, 0.9631336405529954]
  • Brevity Penalty: 0.9932
  • Length Ratio: 0.9933
  • Translation Length: 11789
  • Reference Length: 11869
  • Meteor: 0.7484
  • Score: 14.6658
  • Num Edits: 1709
  • Ref Length: 11653.0
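
A minimal inference sketch, assuming the checkpoint is published as Holmeister/TURNA_spell_correction_product_search and follows TURNA's standard encoder-decoder (text2text) interface in Transformers. The example query is illustrative; the exact input format used during fine-tuning is not documented in this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Holmeister/TURNA_spell_correction_product_search"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative misspelled Turkish product-search query; the prompt format
# actually used during fine-tuning is not documented in this card.
query = "kablosz kulaklik"
inputs = tokenizer(query, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```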

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
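
For reference, a hedged reconstruction of these settings as Transformers Seq2SeqTrainingArguments; the output_dir and predict_with_generate flag are illustrative assumptions, not taken from the card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="TURNA_spell_correction_product_search",  # illustrative, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    predict_with_generate=True,  # assumption: needed to produce text for ROUGE/BLEU evaluation
)
```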

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Meteor | Score | Num Edits | Ref Length |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| No log | 0.3335 | 1253 | 0.2447 | 0.7099 | 0.5537 | 0.7097 | 0.7097 | 0.7033 | [0.7548184082863512, 0.6378386771213415, 0.6301176470588236, 0.906832298136646] | 0.9711 | 0.9715 | 22881 | 23553 | 0.5852 | 27.8452 | 6474 | 23250.0 |
| No log | 0.6670 | 2506 | 0.1886 | 0.7586 | 0.6231 | 0.7584 | 0.7585 | 0.7555 | [0.7995148154565933, 0.7114032405992051, 0.7142528735632184, 0.8698224852071006] | 0.9799 | 0.9801 | 23084 | 23553 | 0.6454 | 22.8860 | 5321 | 23250.0 |
| 0.3827 | 1.0005 | 3759 | 0.1571 | 0.7810 | 0.6561 | 0.7807 | 0.7809 | 0.7947 | [0.8183424557169332, 0.7418011058092858, 0.7430269775948788, 0.939297124600639] | 0.9850 | 0.9851 | 23203 | 23553 | 0.6742 | 20.6710 | 4806 | 23250.0 |
| 0.3827 | 1.3340 | 5012 | 0.1458 | 0.7973 | 0.6822 | 0.7973 | 0.7974 | 0.8139 | [0.8318891557995882, 0.7666015625, 0.7682954289574421, 0.9333333333333333] | 0.9897 | 0.9898 | 23312 | 23553 | 0.6955 | 19.1441 | 4451 | 23250.0 |
| 0.3827 | 1.6676 | 6265 | 0.1320 | 0.8109 | 0.6993 | 0.8107 | 0.8111 | 0.8294 | [0.8467426359922597, 0.7889852885703508, 0.788783355947535, 0.9453376205787781] | 0.9873 | 0.9873 | 23255 | 23553 | 0.7111 | 17.6258 | 4098 | 23250.0 |
| 0.1238 | 2.0011 | 7518 | 0.1218 | 0.8205 | 0.7139 | 0.8205 | 0.8206 | 0.8462 | [0.8559577028885832, 0.8045084439083233, 0.8144353369763205, 0.9607843137254902] | 0.9877 | 0.9877 | 23264 | 23553 | 0.7231 | 16.5720 | 3853 | 23250.0 |
| 0.1238 | 2.3346 | 8771 | 0.1223 | 0.8246 | 0.7219 | 0.8247 | 0.8249 | 0.8506 | [0.8575583882282488, 0.8074450590521752, 0.8080267558528428, 0.9639344262295082] | 0.9925 | 0.9926 | 23378 | 23553 | 0.7298 | 16.1978 | 3766 | 23250.0 |
| 0.1238 | 2.6681 | 10024 | 0.1177 | 0.8319 | 0.7326 | 0.8320 | 0.8321 | 0.8580 | [0.8628791114908159, 0.8155853840417598, 0.8160765976397238, 0.9671052631578947] | 0.9939 | 0.9939 | 23410 | 23553 | 0.7379 | 15.6602 | 3641 | 23250.0 |
| 0.0686 | 3.0016 | 11277 | 0.1122 | 0.8388 | 0.7400 | 0.8391 | 0.8391 | 0.8623 | [0.8686514886164624, 0.8236522257848036, 0.8239625167336011, 0.9607843137254902] | 0.9940 | 0.9940 | 23411 | 23553 | 0.7462 | 15.0237 | 3493 | 23250.0 |
| 0.0686 | 3.3351 | 12530 | 0.1184 | 0.8398 | 0.7450 | 0.8397 | 0.8398 | 0.8682 | [0.8676339190741608, 0.8243353328889876, 0.8229854689564069, 0.9735099337748344] | 0.9979 | 0.9979 | 23503 | 23553 | 0.7488 | 14.9677 | 3480 | 23250.0 |
| 0.0686 | 3.6686 | 13783 | 0.1148 | 0.8440 | 0.7484 | 0.8441 | 0.8442 | 0.8716 | [0.8706277359853798, 0.8271121294995935, 0.826640333552776, 0.9735099337748344] | 0.9990 | 0.9990 | 23529 | 23553 | 0.7533 | 14.5806 | 3390 | 23250.0 |
| 0.0383 | 4.0021 | 15036 | 0.1134 | 0.8498 | 0.7547 | 0.8498 | 0.8500 | 0.8750 | [0.8757069354084279, 0.8344307168750462, 0.8344676180021954, 0.9671052631578947] | 0.9985 | 0.9985 | 23517 | 23553 | 0.7592 | 14.0516 | 3267 | 23250.0 |
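
The metric columns above match the output fields of the `evaluate` library's ROUGE, BLEU, METEOR, and TER metrics (TER reports `score`, `num_edits`, and `ref_length`, which appear to correspond to the Score, Num Edits, and Ref Length columns; the separate Ref Length versus Reference Length values likely reflect the metrics' different tokenizations). A sketch of how such a metric bundle could be computed, under that assumption:

```python
import evaluate

# Assumption: these evaluate-library metrics produce the fields reported above
# (rouge1/rouge2/rougeL/rougeLsum; bleu/precisions/brevity_penalty/length_ratio/
#  translation_length/reference_length; meteor; and TER's score/num_edits/ref_length).
rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
meteor = evaluate.load("meteor")
ter = evaluate.load("ter")

def compute_text_metrics(predictions, references):
    """Merge all four metrics into one flat results dictionary."""
    results = {}
    results.update(rouge.compute(predictions=predictions, references=references))
    results.update(bleu.compute(predictions=predictions, references=references))
    results.update(meteor.compute(predictions=predictions, references=references))
    results.update(ter.compute(predictions=predictions, references=references))
    return results
```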

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1

Citation Information

Uludoğan, G., Balal, Z. Y., Akkurt, F., Türker, M., Güngör, O., & Üsküdarlı, S. (2024).
TURNA: A Turkish encoder-decoder language model for enhanced understanding and generation. arXiv preprint arXiv:2401.14373.
