
base_model_custom_tokenizer

This model is a fine-tuned version of t5-base on the code_search_net dataset; a hedged usage sketch follows the metrics below. It achieves the following results on the evaluation set:

  • Loss: 2.9297
  • BLEU: 0.0419
  • Precisions (1- to 4-gram): [0.16646886171883812, 0.051341379400381214, 0.025538496667355304, 0.01408001744219341]
  • Brevity Penalty: 1.0
  • Length Ratio: 1.9160
  • Translation Length: 1515803
  • Reference Length: 791127
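
The card's description and intended-use sections are unfilled, but given the t5-base backbone and the code_search_net training data, a plausible use is code-to-text generation (e.g. summarizing a function as a docstring). A minimal loading sketch, in which the task framing, input snippet, and generation settings are assumptions rather than documented usage:

```python
# Minimal usage sketch; the code-summarization framing is an assumption
# inferred from the code_search_net training data, not stated by the card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "sc20fg/base_model_custom_tokenizer"
tokenizer = AutoTokenizer.from_pretrained(model_id)  # repo ships a custom tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

code = "def add(a, b):\n    return a + b"  # hypothetical input
inputs = tokenizer(code, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```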

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a rough Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
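
A hedged sketch of how these settings map onto transformers' Seq2SeqTrainingArguments; the output directory, and anything not in the list above, are assumptions, since the card does not include the training script:

```python
# Rough reconstruction of the listed hyperparameters; not the author's
# actual script. output_dir and all omitted arguments are assumptions.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="base_model_custom_tokenizer",  # assumed
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```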

Training results

| Training Loss | Epoch | Step | BLEU | Brevity Penalty | Length Ratio | Validation Loss | Precisions (1- to 4-gram) | Reference Length | Translation Length |
|:--------------|:------|:-------|:-------|:----------------|:-------------|:----------------|:--------------------------|:-----------------|:-------------------|
| 3.9604 | 1.0 | 25762 | 0.0311 | 1.0 | 2.0901 | 3.8577 | [0.12981129473835085, 0.037916946342151155, 0.018860549385742668, 0.010123458812721054] | 791127 | 1653531 |
| 3.7556 | 2.0 | 51524 | 0.0304 | 1.0 | 2.0887 | 3.5650 | [0.12978779415458075, 0.037579383019195466, 0.018120049525730805, 0.00967159578808246] | 791127 | 1652405 |
| 3.5524 | 3.0 | 77286 | 0.0337 | 1.0 | 2.0745 | 3.4150 | [0.1400710094937268, 0.04118126290523918, 0.0203289377688518, 0.01095848654003696] | 791127 | 1641189 |
| 3.4698 | 4.0 | 103048 | 0.0340 | 1.0 | 2.0788 | 3.3056 | [0.14277601173291565, 0.041700438046903744, 0.020391137906857287, 0.010998711103394348] | 791127 | 1644604 |
| 3.3163 | 5.0 | 128810 | 0.0377 | 1.0 | 2.0193 | 3.2312 | [0.15481298837386176, 0.04617083876865068, 0.022825576079888228, 0.012408874977873952] | 791127 | 1597521 |
| 3.2458 | 6.0 | 154572 | 0.0382 | 1.0 | 1.9276 | 3.1719 | [0.1593547435203856, 0.04704355006890476, 0.023023369844916947, 0.012389103841794662] | 791127 | 1524975 |
| 3.1574 | 7.0 | 180334 | 0.0373 | 1.0 | 2.0231 | 3.1267 | [0.15301209486452477, 0.04557636504175273, 0.022512350851579006, 0.012331176442211789] | 791127 | 1600514 |
| 3.1398 | 8.0 | 206096 | 0.0386 | 1.0 | 1.9724 | 3.0893 | [0.1577822509066417, 0.04745355472604797, 0.023342833604973825, 0.012766267921605798] | 791127 | 1560429 |
| 3.0691 | 9.0 | 231858 | 0.0399 | 1.0 | 1.9159 | 3.0574 | [0.16179891666501725, 0.0490436396529825, 0.024170720153435545, 0.013205125551162357] | 791127 | 1515690 |
| 3.0536 | 10.0 | 257620 | 0.0410 | 1.0 | 1.8550 | 3.0321 | [0.1656489584760067, 0.05027218283158705, 0.024914277684092188, 0.013668271409759075] | 791127 | 1467513 |
| 3.0379 | 11.0 | 283382 | 0.0404 | 1.0 | 1.8928 | 3.0082 | [0.1630008107267023, 0.049590989569352824, 0.02452930558336929, 0.013463575807213558] | 791127 | 1497422 |
| 3.0183 | 12.0 | 309144 | 0.0409 | 1.0 | 1.9428 | 2.9924 | [0.16253787482001938, 0.049984123536708294, 0.02498794115282579, 0.01380309274144192] | 791127 | 1536971 |
| 2.9442 | 13.0 | 334906 | 0.0413 | 1.0 | 1.9288 | 2.9773 | [0.16426924674922966, 0.05052962811986506, 0.025225357778251727, 0.013893123599262487] | 791127 | 1525946 |
| 2.9746 | 14.0 | 360668 | 0.0411 | 1.0 | 1.9154 | 2.9622 | [0.16395222297528722, 0.050373776569881686, 0.02506334156586741, 0.013817874614866431] | 791127 | 1515289 |
| 2.9556 | 15.0 | 386430 | 0.0416 | 1.0 | 1.8903 | 2.9505 | [0.16631916674913938, 0.05114349827528396, 0.025291167834370104, 0.013919582587470626] | 791127 | 1495444 |
| 2.9423 | 16.0 | 412192 | 0.0415 | 1.0 | 1.9161 | 2.9441 | [0.1656048056193977, 0.050903942131636466, 0.02527336097239107, 0.013901882376966617] | 791127 | 1515892 |
| 2.9257 | 17.0 | 437954 | 0.0417 | 1.0 | 1.9204 | 2.9387 | [0.16566872310834463, 0.051149695919205686, 0.02547749541013215, 0.01403388257902964] | 791127 | 1519291 |
| 2.9023 | 18.0 | 463716 | 0.0417 | 1.0 | 1.9252 | 2.9331 | [0.16569868978430946, 0.05118214894137258, 0.025432645752525008, 0.014019028423183673] | 791127 | 1523108 |
| 2.946 | 19.0 | 489478 | 0.0420 | 1.0 | 1.9138 | 2.9301 | [0.16682044755191178, 0.051534782710695386, 0.02563003483561942, 0.014141190855303378] | 791127 | 1514059 |
| 2.8761 | 20.0 | 515240 | 0.0419 | 1.0 | 1.9160 | 2.9297 | [0.16646886171883812, 0.051341379400381214, 0.025538496667355304, 0.01408001744219341] | 791127 | 1515803 |
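
The metric columns above match the fields returned by the Hugging Face evaluate library's bleu metric (BLEU score, 1- to 4-gram precisions, brevity penalty, length ratio, and the two lengths). A minimal scoring sketch, assuming that metric was the scorer; the strings are placeholders, not evaluation data:

```python
# Sketch of the presumed scoring step; evaluate's "bleu" returns exactly
# the fields reported in the table above. Example strings are placeholders.
import evaluate

bleu = evaluate.load("bleu")
result = bleu.compute(
    predictions=["returns the sum of two numbers"],   # hypothetical output
    references=[["Return the sum of two numbers."]],  # hypothetical reference
)
# result keys: bleu, precisions, brevity_penalty, length_ratio,
# translation_length, reference_length
print(result)
```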

Framework versions

  • Transformers 4.37.2
  • PyTorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2