scenario-kd-pre-ner-half-mdeberta_data-univner_full66
This model is a fine-tuned version of microsoft/mdeberta-v3-base. The training dataset is not recorded in this card (it is listed as None). It achieves the following results on the evaluation set:
- Loss: 72.6935
- Precision: 0.6796
- Recall: 0.6181
- F1: 0.6474
- Accuracy: 0.9659
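The reported F1 agrees with the harmonic mean of the precision and recall listed above. The following is a minimal check, assuming the standard F1 definition; the card does not say which metric library produced these numbers, and the helper name `f1_score` is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results above.
precision, recall = 0.6796, 0.6181
f1 = f1_score(precision, recall)
print(round(f1, 4))  # 0.6474, matching the reported F1
```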
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 8
- eval_batch_size: 32
- seed: 66
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
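The relationship between the batch-size settings and the linear schedule can be sketched in plain Python. This is a sketch under stated assumptions, not the training script: the single-device setting (`num_devices = 1`), zero warmup, and the `total_steps` value used in the example are assumptions, since the card states none of them.

```python
learning_rate = 3e-05
train_batch_size = 8
gradient_accumulation_steps = 4
num_devices = 1  # assumption: the card does not state the device count

# Effective batch size per optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 32, matching total_train_batch_size above


def linear_lr(step: int, total_steps: int,
              base_lr: float = learning_rate, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical total_steps, chosen only to illustrate the shape of the decay.
example_total_steps = 10_000
print(linear_lr(0, example_total_steps))       # starts at the base learning rate
print(linear_lr(5_000, example_total_steps))   # half the base rate at the midpoint
print(linear_lr(10_000, example_total_steps))  # 0.0 at the end
```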
Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
174.1041 | 0.29 | 500 | 134.2922 | 0.0 | 0.0 | 0.0 | 0.9241 |
125.5123 | 0.58 | 1000 | 119.1368 | 0.0 | 0.0 | 0.0 | 0.9241 |
112.735 | 0.87 | 1500 | 108.9108 | 0.0290 | 0.0003 | 0.0006 | 0.9243 |
103.9781 | 1.16 | 2000 | 100.5534 | 0.2224 | 0.0876 | 0.1257 | 0.9312 |
96.6859 | 1.46 | 2500 | 95.6493 | 0.2821 | 0.1262 | 0.1744 | 0.9343 |
92.331 | 1.75 | 3000 | 90.4952 | 0.4293 | 0.2782 | 0.3376 | 0.9430 |
88.0821 | 2.04 | 3500 | 87.3770 | 0.5074 | 0.3334 | 0.4024 | 0.9472 |
84.7183 | 2.33 | 4000 | 84.8580 | 0.5047 | 0.4222 | 0.4597 | 0.9513 |
82.5316 | 2.62 | 4500 | 83.2307 | 0.5728 | 0.3926 | 0.4659 | 0.9517 |
80.8193 | 2.91 | 5000 | 81.3109 | 0.5713 | 0.4522 | 0.5048 | 0.9550 |
78.7442 | 3.2 | 5500 | 80.1857 | 0.5653 | 0.5155 | 0.5392 | 0.9577 |
77.6656 | 3.49 | 6000 | 79.3523 | 0.5833 | 0.5346 | 0.5579 | 0.9588 |
76.4846 | 3.78 | 6500 | 78.3908 | 0.6190 | 0.5148 | 0.5621 | 0.9587 |
75.9335 | 4.08 | 7000 | 77.5087 | 0.6213 | 0.5347 | 0.5748 | 0.9602 |
74.7006 | 4.37 | 7500 | 77.0236 | 0.6098 | 0.5647 | 0.5864 | 0.9607 |
74.0752 | 4.66 | 8000 | 76.5373 | 0.6282 | 0.5696 | 0.5975 | 0.9615 |
73.8701 | 4.95 | 8500 | 75.8483 | 0.6267 | 0.5887 | 0.6071 | 0.9629 |
73.1911 | 5.24 | 9000 | 75.4438 | 0.6511 | 0.5876 | 0.6177 | 0.9637 |
72.5251 | 5.53 | 9500 | 75.0009 | 0.6443 | 0.5826 | 0.6119 | 0.9632 |
71.9142 | 5.82 | 10000 | 74.7027 | 0.6427 | 0.6166 | 0.6294 | 0.9642 |
71.6262 | 6.11 | 10500 | 74.3620 | 0.6683 | 0.5985 | 0.6315 | 0.9642 |
71.2638 | 6.4 | 11000 | 74.2337 | 0.6808 | 0.5713 | 0.6213 | 0.9640 |
71.1012 | 6.69 | 11500 | 73.8078 | 0.6718 | 0.6166 | 0.6430 | 0.9651 |
70.8483 | 6.99 | 12000 | 73.6011 | 0.6728 | 0.6014 | 0.6351 | 0.9651 |
70.5965 | 7.28 | 12500 | 73.5710 | 0.6875 | 0.5869 | 0.6333 | 0.9649 |
70.2581 | 7.57 | 13000 | 73.3481 | 0.6653 | 0.6270 | 0.6456 | 0.9663 |
70.0625 | 7.86 | 13500 | 73.2402 | 0.6830 | 0.5982 | 0.6378 | 0.9652 |
69.9168 | 8.15 | 14000 | 73.1845 | 0.6806 | 0.6116 | 0.6443 | 0.9655 |
69.7048 | 8.44 | 14500 | 72.9083 | 0.6836 | 0.6156 | 0.6478 | 0.9657 |
69.5969 | 8.73 | 15000 | 72.7066 | 0.6783 | 0.6234 | 0.6497 | 0.9661 |
69.3262 | 9.02 | 15500 | 72.7057 | 0.6733 | 0.6218 | 0.6466 | 0.9658 |
69.3689 | 9.31 | 16000 | 72.6039 | 0.6847 | 0.6198 | 0.6507 | 0.9663 |
69.3319 | 9.61 | 16500 | 72.6736 | 0.6825 | 0.6247 | 0.6524 | 0.9661 |
69.2166 | 9.9 | 17000 | 72.6935 | 0.6796 | 0.6181 | 0.6474 | 0.9659 |
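The headline metrics correspond to the last logged row (step 17000), which is not the best row in the table by either F1 or validation loss. The sketch below copies only the last five rows, which contain both optima for the full table; the tuple layout and variable names are illustrative, and whether intermediate checkpoints were actually saved is not stated in the card.

```python
# (step, validation_loss, f1) copied from the last five rows of the table above.
rows = [
    (15000, 72.7066, 0.6497),
    (15500, 72.7057, 0.6466),
    (16000, 72.6039, 0.6507),
    (16500, 72.6736, 0.6524),
    (17000, 72.6935, 0.6474),
]

best_by_f1 = max(rows, key=lambda r: r[2])    # highest F1
best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
final = rows[-1]

print("best F1:", best_by_f1)      # (16500, 72.6736, 0.6524)
print("best loss:", best_by_loss)  # (16000, 72.6039, 0.6507)
print("final:", final)             # (17000, 72.6935, 0.6474)
```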
Framework versions
- Transformers 4.33.3
- Pytorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.13.3