
working

This model is a fine-tuned version of csebuetnlp/banglat5 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0912
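
The original task and dataset are not documented here, but the checkpoint can be loaded like any other T5-style text-to-text model. A minimal, hypothetical inference sketch follows; the input string is only a placeholder.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical usage sketch: loads the checkpoint as a standard T5-style
# seq2seq model. The task it was fine-tuned for is not documented, so the
# input below is only a placeholder.
model_id = "Nazmus201/working"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "placeholder input text"  # replace with task-appropriate input
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```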

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 6
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 48
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 25
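
As a rough guide, these settings could be expressed with Seq2SeqTrainingArguments as in the sketch below; the output directory, evaluation cadence, and generation flag are assumptions rather than values taken from the original training script.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above; the
# original training script is not published, so anything not in the list
# (output_dir, eval/logging cadence, predict_with_generate) is an assumption.
training_args = Seq2SeqTrainingArguments(
    output_dir="working",                      # assumed
    learning_rate=1e-3,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,             # 6 x 8 = 48 total train batch size
    seed=42,
    lr_scheduler_type="cosine_with_restarts",
    warmup_steps=100,
    num_train_epochs=25,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 matches the default optimizer.
    eval_strategy="steps",                     # assumed: validation loss is reported every 100 steps
    eval_steps=100,
    logging_steps=100,                         # assumed
    predict_with_generate=True,                # assumed for a seq2seq fine-tune
)
```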

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 39.4442       | 1.0540  | 100  | 0.5396          |
| 0.6861        | 2.1080  | 200  | 0.3455          |
| 0.4661        | 3.1621  | 300  | 0.3372          |
| 0.4512        | 4.2161  | 400  | 0.3265          |
| 0.4309        | 5.2701  | 500  | 0.3141          |
| 0.4114        | 6.3241  | 600  | 0.2886          |
| 0.3679        | 7.3781  | 700  | 0.1776          |
| 0.208         | 8.4321  | 800  | 0.1162          |
| 0.4885        | 9.4862  | 900  | 0.1208          |
| 0.137         | 10.5402 | 1000 | 0.1012          |
| 0.1013        | 11.5942 | 1100 | 0.0948          |
| 0.0838        | 12.6482 | 1200 | 0.0894          |
| 0.0704        | 13.7022 | 1300 | 0.0878          |
| 0.0618        | 14.7563 | 1400 | 0.0902          |
| 0.0546        | 15.8103 | 1500 | 0.0880          |
| 0.049         | 16.8643 | 1600 | 0.0901          |
| 0.0443        | 17.9183 | 1700 | 0.0900          |
| 0.0417        | 18.9723 | 1800 | 0.0905          |
| 0.0396        | 20.0264 | 1900 | 0.0909          |
| 0.0373        | 21.0804 | 2000 | 0.0909          |
| 0.0369        | 22.1344 | 2100 | 0.0913          |
| 0.0358        | 23.1884 | 2200 | 0.0911          |
| 0.0356        | 24.2424 | 2300 | 0.0912          |

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.4.0
  • Datasets 2.21.0
  • Tokenizers 0.19.1

Safetensors

  • Model size: 248M params
  • Tensor type: F32

Model tree for Nazmus201/working

  • Base model: csebuetnlp/banglat5