---
license: mit
base_model: sagorsarker/bangla-bert-base
tags:
- generated_from_trainer
model-index:
- name: sagorbert_nwp_finetuning_def_v3
  results: []
---

# sagorbert_nwp_finetuning_def_v3

This model is a fine-tuned version of [sagorsarker/bangla-bert-base](https://huggingface.co/sagorsarker/bangla-bert-base) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 2.7146

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 4.1717        | 1.0   | 1551  | 3.9996          |
| 3.7173        | 2.0   | 3102  | 3.6406          |
| 3.4565        | 3.0   | 4653  | 3.4235          |
| 3.2657        | 4.0   | 6204  | 3.2330          |
| 3.1522        | 5.0   | 7755  | 3.2134          |
| 3.0686        | 6.0   | 9306  | 3.1522          |
| 2.9315        | 7.0   | 10857 | 3.0937          |
| 2.8902        | 8.0   | 12408 | 3.0556          |
| 2.7995        | 9.0   | 13959 | 3.0475          |
| 2.7451        | 10.0  | 15510 | 2.9813          |
| 2.7015        | 11.0  | 17061 | 2.9560          |
| 2.6528        | 12.0  | 18612 | 2.9613          |
| 2.5797        | 13.0  | 20163 | 2.9195          |
| 2.5343        | 14.0  | 21714 | 2.8609          |
| 2.4927        | 15.0  | 23265 | 2.8933          |
| 2.4433        | 16.0  | 24816 | 2.8718          |
| 2.3995        | 17.0  | 26367 | 2.8405          |
| 2.3875        | 18.0  | 27918 | 2.8703          |
| 2.3171        | 19.0  | 29469 | 2.8371          |
| 2.319         | 20.0  | 31020 | 2.8027          |
| 2.2824        | 21.0  | 32571 | 2.7959          |
| 2.2633        | 22.0  | 34122 | 2.8165          |
| 2.2149        | 23.0  | 35673 | 2.7747          |
| 2.1812        | 24.0  | 37224 | 2.7879          |
| 2.1677        | 25.0  | 38775 | 2.7723          |
| 2.1521        | 26.0  | 40326 | 2.7887          |
| 2.14          | 27.0  | 41877 | 2.7839          |
| 2.059         | 28.0  | 43428 | 2.8150          |
| 2.0881        | 29.0  | 44979 | 2.7617          |
| 2.0583        | 30.0  | 46530 | 2.7491          |
| 2.0574        | 31.0  | 48081 | 2.7303          |
| 2.0416        | 32.0  | 49632 | 2.7490          |
| 1.9837        | 33.0  | 51183 | 2.7419          |
| 1.9747        | 34.0  | 52734 | 2.7409          |
| 1.9486        | 35.0  | 54285 | 2.7757          |
| 1.941         | 36.0  | 55836 | 2.7546          |
| 1.9549        | 37.0  | 57387 | 2.7046          |
| 1.9346        | 38.0  | 58938 | 2.7700          |
| 1.8979        | 39.0  | 60489 | 2.7033          |
| 1.9104        | 40.0  | 62040 | 2.7383          |
| 1.8989        | 41.0  | 63591 | 2.6837          |
| 1.8691        | 42.0  | 65142 | 2.7084          |
| 1.8492        | 43.0  | 66693 | 2.7000          |
| 1.8271        | 44.0  | 68244 | 2.6792          |
| 1.8723        | 45.0  | 69795 | 2.7325          |
| 1.8208        | 46.0  | 71346 | 2.6998          |
| 1.8218        | 47.0  | 72897 | 2.7490          |
| 1.8305        | 48.0  | 74448 | 2.7394          |
| 1.8067        | 49.0  | 75999 | 2.6545          |
| 1.7974        | 50.0  | 77550 | 2.6925          |

### Framework versions

- Transformers 4.38.1
- PyTorch 2.1.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2
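
### Reproducing the training configuration

The hyperparameters listed above correspond to a standard Hugging Face `Trainer` run. Below is a minimal sketch of the equivalent `TrainingArguments`; the training script itself is not included in this card, so the output directory is a placeholder and the per-epoch evaluation strategy is inferred from the results table.

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
# Assumption: training used the standard Hugging Face Trainer; output_dir is
# a placeholder, since the actual training script is not provided here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="sagorbert_nwp_finetuning_def_v3",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=50,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # validation loss was logged once per epoch
)
```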
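
### Usage sketch

As a usage sketch only: assuming the checkpoint was fine-tuned as a masked language model (the usual objective for `bangla-bert-base`), next-word prediction can be approximated by placing the mask token at the end of a prefix and querying the `fill-mask` pipeline. The model path below is a placeholder for wherever this checkpoint is published or saved locally.

```python
# Minimal inference sketch; model_path is a placeholder (local dir or Hub repo id).
from transformers import pipeline

model_path = "sagorbert_nwp_finetuning_def_v3"  # placeholder
fill_mask = pipeline("fill-mask", model=model_path)

# Predict the next word by masking the position after the prefix.
prefix = "আমি বাংলায় গান"  # "I ... songs in Bengali"
for prediction in fill_mask(f"{prefix} {fill_mask.tokenizer.mask_token}", top_k=5):
    print(prediction["token_str"], round(prediction["score"], 4))
```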