metadata

license: other
tags:
  - generated_from_trainer
base_model: boun-tabi-LMG/TURNA
metrics:
  - rouge
  - bleu
model-index:
  - name: TURNA_spell_correction_product_search
    results: []

TURNA_spell_correction_product_search

This model is a fine-tuned version of boun-tabi-LMG/TURNA. The fine-tuning dataset is not documented in this card. It achieves the following results on the evaluation set:

Loss: 0.1088
Rouge1: 0.8437
Rouge2: 0.7401
Rougel: 0.8435
Rougelsum: 0.8437
Bleu: 0.8713
Precisions: [0.8736109932988378, 0.8306083370157608, 0.8473118279569892, 0.9631336405529954]
Brevity Penalty: 0.9932
Length Ratio: 0.9933
Translation Length: 11789
Reference Length: 11869
Meteor: 0.7484
Score: 14.6658
Num Edits: 1709
Ref Length: 11653.0
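
The figures above can be cross-checked against each other. The sketch below recomputes the length ratio, brevity penalty and BLEU from the listed precisions and lengths, using the standard BLEU definition. It also assumes that Score is an edit-rate percentage (100 × Num Edits / Ref Length, as in a TER-style metric). That reading is inferred from the numbers, not stated in this card.

```python
import math

# Values copied from the evaluation results listed above.
precisions = [0.8736109932988378, 0.8306083370157608,
              0.8473118279569892, 0.9631336405529954]
translation_length = 11789
reference_length = 11869
num_edits = 1709
ref_length = 11653.0

# Length ratio and BLEU brevity penalty (penalises outputs shorter than the reference).
length_ratio = translation_length / reference_length
if translation_length > reference_length:
    brevity_penalty = 1.0
else:
    brevity_penalty = math.exp(1.0 - reference_length / translation_length)

# BLEU = brevity penalty * geometric mean of the 1- to 4-gram precisions.
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
bleu = brevity_penalty * geo_mean

# Assumption: "Score" is an edit-rate percentage, 100 * edits / reference length.
edit_rate = 100.0 * num_edits / ref_length

print(f"length ratio    {length_ratio:.4f}")
print(f"brevity penalty {brevity_penalty:.4f}")
print(f"BLEU            {bleu:.4f}")
print(f"edit rate       {edit_rate:.4f}")
```

The recomputed values should agree with the reported Length Ratio, Brevity Penalty, Bleu and Score to within rounding.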

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 5
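
As a rough illustration of what lr_scheduler_type: linear implies for the learning rate, here is a minimal sketch of a linear decay schedule. The function linear_lr is a hypothetical helper, not part of any library. The card does not list a warmup setting, so zero warmup is an assumption. The steps-per-epoch figure is inferred from the training results table (step 3759 is logged at epoch 1.0005) and is approximate.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05,
              warmup_steps: int = 0) -> float:
    """Optional linear warmup, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Assumption: about 3757 optimizer steps per epoch (inferred from the results
# table), with the 5 configured epochs giving the total number of steps.
steps_per_epoch = 3757
total_steps = steps_per_epoch * 5

# Learning rate at a few of the logged evaluation steps.
for step in (0, 3759, 7518, 11277, 15036, total_steps):
    print(step, f"{linear_lr(step, total_steps):.3e}")
```

Under these assumptions the learning rate starts at 5e-05 and falls linearly to zero at the final step. A run configured with warmup would follow a different curve early in training.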

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Bleu	Precisions	Brevity Penalty	Length Ratio	Translation Length	Reference Length	Meteor	Score	Num Edits	Ref Length
No log	0.3335	1253	0.2447	0.7099	0.5537	0.7097	0.7097	0.7033	[0.7548184082863512, 0.6378386771213415, 0.6301176470588236, 0.906832298136646]	0.9711	0.9715	22881	23553	0.5852	27.8452	6474	23250.0
No log	0.6670	2506	0.1886	0.7586	0.6231	0.7584	0.7585	0.7555	[0.7995148154565933, 0.7114032405992051, 0.7142528735632184, 0.8698224852071006]	0.9799	0.9801	23084	23553	0.6454	22.8860	5321	23250.0
0.3827	1.0005	3759	0.1571	0.7810	0.6561	0.7807	0.7809	0.7947	[0.8183424557169332, 0.7418011058092858, 0.7430269775948788, 0.939297124600639]	0.9850	0.9851	23203	23553	0.6742	20.6710	4806	23250.0
0.3827	1.3340	5012	0.1458	0.7973	0.6822	0.7973	0.7974	0.8139	[0.8318891557995882, 0.7666015625, 0.7682954289574421, 0.9333333333333333]	0.9897	0.9898	23312	23553	0.6955	19.1441	4451	23250.0
0.3827	1.6676	6265	0.1320	0.8109	0.6993	0.8107	0.8111	0.8294	[0.8467426359922597, 0.7889852885703508, 0.788783355947535, 0.9453376205787781]	0.9873	0.9873	23255	23553	0.7111	17.6258	4098	23250.0
0.1238	2.0011	7518	0.1218	0.8205	0.7139	0.8205	0.8206	0.8462	[0.8559577028885832, 0.8045084439083233, 0.8144353369763205, 0.9607843137254902]	0.9877	0.9877	23264	23553	0.7231	16.5720	3853	23250.0
0.1238	2.3346	8771	0.1223	0.8246	0.7219	0.8247	0.8249	0.8506	[0.8575583882282488, 0.8074450590521752, 0.8080267558528428, 0.9639344262295082]	0.9925	0.9926	23378	23553	0.7298	16.1978	3766	23250.0
0.1238	2.6681	10024	0.1177	0.8319	0.7326	0.8320	0.8321	0.8580	[0.8628791114908159, 0.8155853840417598, 0.8160765976397238, 0.9671052631578947]	0.9939	0.9939	23410	23553	0.7379	15.6602	3641	23250.0
0.0686	3.0016	11277	0.1122	0.8388	0.7400	0.8391	0.8391	0.8623	[0.8686514886164624, 0.8236522257848036, 0.8239625167336011, 0.9607843137254902]	0.9940	0.9940	23411	23553	0.7462	15.0237	3493	23250.0
0.0686	3.3351	12530	0.1184	0.8398	0.7450	0.8397	0.8398	0.8682	[0.8676339190741608, 0.8243353328889876, 0.8229854689564069, 0.9735099337748344]	0.9979	0.9979	23503	23553	0.7488	14.9677	3480	23250.0
0.0686	3.6686	13783	0.1148	0.8440	0.7484	0.8441	0.8442	0.8716	[0.8706277359853798, 0.8271121294995935, 0.826640333552776, 0.9735099337748344]	0.9990	0.9990	23529	23553	0.7533	14.5806	3390	23250.0
0.0383	4.0021	15036	0.1134	0.8498	0.7547	0.8498	0.8500	0.8750	[0.8757069354084279, 0.8344307168750462, 0.8344676180021954, 0.9671052631578947]	0.9985	0.9985	23517	23553	0.7592	14.0516	3267	23250.0

Framework versions

Transformers 4.41.2
PyTorch 2.3.1+cu121
Datasets 2.19.2
Tokenizers 0.19.1

Citation Information

Uludoğan, G., Balal, Z. Y., Akkurt, F., Türker, M., Güngör, O., & Üsküdarlı, S. (2024). TURNA: A Turkish encoder-decoder language model for enhanced understanding and generation. arXiv preprint arXiv:2401.14373.