# bert_baseline_prompt_adherence_task5_fold1
This model is a fine-tuned version of [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.5066
- Qwk: 0.6974
- Mse: 0.5063
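Here Qwk denotes quadratic weighted kappa, the standard agreement metric for essay-scoring tasks, and Mse is mean squared error. The card does not include the evaluation code; the following is a minimal sketch of how these two metrics are commonly computed with scikit-learn, assuming integer gold scores and regression-style model outputs (`compute_metrics` is a hypothetical helper, not from this repository):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, mean_squared_error

def compute_metrics(predictions: np.ndarray, labels: np.ndarray) -> dict:
    """MSE on raw outputs, QWK on rounded integer scores."""
    mse = mean_squared_error(labels, predictions)
    # Kappa compares discrete ratings, so round regression outputs first.
    qwk = cohen_kappa_score(
        labels.round().astype(int),
        predictions.round().astype(int),
        weights="quadratic",
    )
    return {"qwk": qwk, "mse": mse}
```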
## Model description
More information needed
## Intended uses & limitations
More information needed
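No usage guidance is documented. Purely as an illustration, the checkpoint can be loaded like any sequence-classification fine-tune; that the head is a single-logit regressor (as the MSE metric suggests) rather than a classifier is an assumption here:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "salbatarni/bert_baseline_prompt_adherence_task5_fold1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

essay = "Example essay text to score for prompt adherence."  # placeholder input
inputs = tokenizer(essay, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Assuming a single-logit regression head, the raw logit is the predicted score.
print(logits.squeeze().item())
```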
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
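A minimal sketch wiring the hyperparameters above into `transformers.TrainingArguments`, assuming the standard `Trainer` setup (the output directory is a placeholder, and the datasets are not documented in this card):

```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="bert_baseline_prompt_adherence_task5_fold1",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=5,
    lr_scheduler_type="linear",
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are already the defaults of
    # the AdamW optimizer Trainer uses, so no extra arguments are needed.
)
# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=..., eval_dataset=...,
#                   compute_metrics=...)  # datasets not documented here
```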
### Training results
Training Loss | Epoch | Step | Validation Loss | Qwk | Mse |
---|---|---|---|---|---|
No log | 0.0294 | 2 | 2.4485 | 0.0 | 2.4480 |
No log | 0.0588 | 4 | 2.0852 | 0.0360 | 2.0847 |
No log | 0.0882 | 6 | 1.7567 | 0.0 | 1.7562 |
No log | 0.1176 | 8 | 1.3263 | 0.0 | 1.3255 |
No log | 0.1471 | 10 | 1.0635 | 0.0442 | 1.0627 |
No log | 0.1765 | 12 | 0.9516 | 0.3585 | 0.9510 |
No log | 0.2059 | 14 | 0.9261 | 0.3426 | 0.9256 |
No log | 0.2353 | 16 | 0.8773 | 0.3943 | 0.8766 |
No log | 0.2647 | 18 | 0.9391 | 0.3022 | 0.9380 |
No log | 0.2941 | 20 | 0.8954 | 0.2692 | 0.8943 |
No log | 0.3235 | 22 | 0.7439 | 0.4042 | 0.7436 |
No log | 0.3529 | 24 | 0.8582 | 0.4086 | 0.8585 |
No log | 0.3824 | 26 | 0.7256 | 0.4861 | 0.7255 |
No log | 0.4118 | 28 | 0.7472 | 0.3281 | 0.7462 |
No log | 0.4412 | 30 | 0.8611 | 0.2114 | 0.8598 |
No log | 0.4706 | 32 | 0.7545 | 0.3131 | 0.7534 |
No log | 0.5 | 34 | 0.6515 | 0.5445 | 0.6515 |
No log | 0.5294 | 36 | 0.9425 | 0.6176 | 0.9434 |
No log | 0.5588 | 38 | 0.9623 | 0.5922 | 0.9632 |
No log | 0.5882 | 40 | 0.7612 | 0.6289 | 0.7619 |
No log | 0.6176 | 42 | 0.6043 | 0.4936 | 0.6041 |
No log | 0.6471 | 44 | 0.7140 | 0.3329 | 0.7132 |
No log | 0.6765 | 46 | 0.7308 | 0.3042 | 0.7299 |
No log | 0.7059 | 48 | 0.6274 | 0.3623 | 0.6268 |
No log | 0.7353 | 50 | 0.5875 | 0.4255 | 0.5874 |
No log | 0.7647 | 52 | 0.5894 | 0.4308 | 0.5894 |
No log | 0.7941 | 54 | 0.5709 | 0.4855 | 0.5708 |
No log | 0.8235 | 56 | 0.5683 | 0.5325 | 0.5682 |
No log | 0.8529 | 58 | 0.5675 | 0.5357 | 0.5675 |
No log | 0.8824 | 60 | 0.5786 | 0.5893 | 0.5787 |
No log | 0.9118 | 62 | 0.6576 | 0.6631 | 0.6581 |
No log | 0.9412 | 64 | 0.7169 | 0.6881 | 0.7177 |
No log | 0.9706 | 66 | 0.6195 | 0.6844 | 0.6196 |
No log | 1.0 | 68 | 0.6078 | 0.5790 | 0.6072 |
No log | 1.0294 | 70 | 0.5957 | 0.5911 | 0.5952 |
No log | 1.0588 | 72 | 0.6013 | 0.6785 | 0.6014 |
No log | 1.0882 | 74 | 0.6450 | 0.6741 | 0.6453 |
No log | 1.1176 | 76 | 0.6194 | 0.5884 | 0.6197 |
No log | 1.1471 | 78 | 0.5606 | 0.5246 | 0.5606 |
No log | 1.1765 | 80 | 0.5391 | 0.5332 | 0.5388 |
No log | 1.2059 | 82 | 0.5362 | 0.5989 | 0.5359 |
No log | 1.2353 | 84 | 0.5378 | 0.6384 | 0.5377 |
No log | 1.2647 | 86 | 0.5255 | 0.6231 | 0.5252 |
No log | 1.2941 | 88 | 0.5244 | 0.6387 | 0.5242 |
No log | 1.3235 | 90 | 0.5873 | 0.7276 | 0.5875 |
No log | 1.3529 | 92 | 0.6176 | 0.7331 | 0.6180 |
No log | 1.3824 | 94 | 0.5611 | 0.7005 | 0.5612 |
No log | 1.4118 | 96 | 0.5282 | 0.6661 | 0.5281 |
No log | 1.4412 | 98 | 0.5242 | 0.5851 | 0.5237 |
No log | 1.4706 | 100 | 0.5186 | 0.6157 | 0.5181 |
No log | 1.5 | 102 | 0.5152 | 0.6041 | 0.5147 |
No log | 1.5294 | 104 | 0.5122 | 0.6317 | 0.5119 |
No log | 1.5588 | 106 | 0.5594 | 0.7059 | 0.5596 |
No log | 1.5882 | 108 | 0.5694 | 0.7231 | 0.5696 |
No log | 1.6176 | 110 | 0.5407 | 0.7012 | 0.5408 |
No log | 1.6471 | 112 | 0.5077 | 0.6642 | 0.5075 |
No log | 1.6765 | 114 | 0.5118 | 0.6684 | 0.5116 |
No log | 1.7059 | 116 | 0.5244 | 0.6776 | 0.5242 |
No log | 1.7353 | 118 | 0.5586 | 0.6944 | 0.5587 |
No log | 1.7647 | 120 | 0.5758 | 0.7057 | 0.5759 |
No log | 1.7941 | 122 | 0.6111 | 0.7218 | 0.6114 |
No log | 1.8235 | 124 | 0.5891 | 0.7095 | 0.5893 |
No log | 1.8529 | 126 | 0.5425 | 0.6328 | 0.5420 |
No log | 1.8824 | 128 | 0.5435 | 0.5949 | 0.5428 |
No log | 1.9118 | 130 | 0.5082 | 0.6226 | 0.5079 |
No log | 1.9412 | 132 | 0.5134 | 0.6659 | 0.5135 |
No log | 1.9706 | 134 | 0.5564 | 0.6944 | 0.5567 |
No log | 2.0 | 136 | 0.5343 | 0.6532 | 0.5345 |
No log | 2.0294 | 138 | 0.5370 | 0.6560 | 0.5372 |
No log | 2.0588 | 140 | 0.5419 | 0.6827 | 0.5421 |
No log | 2.0882 | 142 | 0.5108 | 0.6510 | 0.5107 |
No log | 2.1176 | 144 | 0.5048 | 0.6191 | 0.5042 |
No log | 2.1471 | 146 | 0.5395 | 0.5780 | 0.5387 |
No log | 2.1765 | 148 | 0.5853 | 0.5666 | 0.5843 |
No log | 2.2059 | 150 | 0.5339 | 0.6041 | 0.5332 |
No log | 2.2353 | 152 | 0.5460 | 0.7059 | 0.5459 |
No log | 2.2647 | 154 | 0.5901 | 0.7249 | 0.5903 |
No log | 2.2941 | 156 | 0.5624 | 0.7169 | 0.5625 |
No log | 2.3235 | 158 | 0.5171 | 0.6760 | 0.5167 |
No log | 2.3529 | 160 | 0.5137 | 0.6556 | 0.5131 |
No log | 2.3824 | 162 | 0.5230 | 0.6913 | 0.5228 |
No log | 2.4118 | 164 | 0.5624 | 0.7203 | 0.5625 |
No log | 2.4412 | 166 | 0.5309 | 0.6919 | 0.5308 |
No log | 2.4706 | 168 | 0.5341 | 0.6092 | 0.5333 |
No log | 2.5 | 170 | 0.5850 | 0.5333 | 0.5839 |
No log | 2.5294 | 172 | 0.5378 | 0.5700 | 0.5369 |
No log | 2.5588 | 174 | 0.4988 | 0.6506 | 0.4986 |
No log | 2.5882 | 176 | 0.5462 | 0.7111 | 0.5466 |
No log | 2.6176 | 178 | 0.6278 | 0.7387 | 0.6286 |
No log | 2.6471 | 180 | 0.6471 | 0.7457 | 0.6480 |
No log | 2.6765 | 182 | 0.5723 | 0.7369 | 0.5728 |
No log | 2.7059 | 184 | 0.4996 | 0.6705 | 0.4996 |
No log | 2.7353 | 186 | 0.4884 | 0.6063 | 0.4878 |
No log | 2.7647 | 188 | 0.5184 | 0.5660 | 0.5176 |
No log | 2.7941 | 190 | 0.5029 | 0.5735 | 0.5022 |
No log | 2.8235 | 192 | 0.4804 | 0.6289 | 0.4801 |
No log | 2.8529 | 194 | 0.4910 | 0.6638 | 0.4909 |
No log | 2.8824 | 196 | 0.4884 | 0.6553 | 0.4882 |
No log | 2.9118 | 198 | 0.4825 | 0.6479 | 0.4820 |
No log | 2.9412 | 200 | 0.4897 | 0.6332 | 0.4890 |
No log | 2.9706 | 202 | 0.4887 | 0.6395 | 0.4882 |
No log | 3.0 | 204 | 0.4956 | 0.6753 | 0.4953 |
No log | 3.0294 | 206 | 0.5352 | 0.7060 | 0.5352 |
No log | 3.0588 | 208 | 0.5285 | 0.6963 | 0.5284 |
No log | 3.0882 | 210 | 0.5038 | 0.6692 | 0.5034 |
No log | 3.1176 | 212 | 0.4896 | 0.6577 | 0.4891 |
No log | 3.1471 | 214 | 0.4824 | 0.6421 | 0.4819 |
No log | 3.1765 | 216 | 0.4753 | 0.6339 | 0.4748 |
No log | 3.2059 | 218 | 0.4711 | 0.6395 | 0.4707 |
No log | 3.2353 | 220 | 0.4710 | 0.6508 | 0.4707 |
No log | 3.2647 | 222 | 0.4885 | 0.6844 | 0.4885 |
No log | 3.2941 | 224 | 0.5330 | 0.7203 | 0.5332 |
No log | 3.3235 | 226 | 0.5352 | 0.7167 | 0.5355 |
No log | 3.3529 | 228 | 0.5054 | 0.7001 | 0.5054 |
No log | 3.3824 | 230 | 0.4922 | 0.6832 | 0.4921 |
No log | 3.4118 | 232 | 0.4958 | 0.6822 | 0.4957 |
No log | 3.4412 | 234 | 0.4967 | 0.6925 | 0.4966 |
No log | 3.4706 | 236 | 0.4895 | 0.6686 | 0.4891 |
No log | 3.5 | 238 | 0.5030 | 0.6935 | 0.5028 |
No log | 3.5294 | 240 | 0.5286 | 0.7051 | 0.5286 |
No log | 3.5588 | 242 | 0.5580 | 0.7234 | 0.5581 |
No log | 3.5882 | 244 | 0.5419 | 0.7100 | 0.5418 |
No log | 3.6176 | 246 | 0.5053 | 0.6821 | 0.5049 |
No log | 3.6471 | 248 | 0.4991 | 0.6656 | 0.4986 |
No log | 3.6765 | 250 | 0.4973 | 0.6528 | 0.4967 |
No log | 3.7059 | 252 | 0.5008 | 0.6663 | 0.5004 |
No log | 3.7353 | 254 | 0.5220 | 0.7029 | 0.5218 |
No log | 3.7647 | 256 | 0.5495 | 0.7174 | 0.5495 |
No log | 3.7941 | 258 | 0.5589 | 0.7306 | 0.5589 |
No log | 3.8235 | 260 | 0.5377 | 0.7045 | 0.5375 |
No log | 3.8529 | 262 | 0.5102 | 0.6874 | 0.5098 |
No log | 3.8824 | 264 | 0.5099 | 0.6873 | 0.5094 |
No log | 3.9118 | 266 | 0.5259 | 0.6949 | 0.5255 |
No log | 3.9412 | 268 | 0.5359 | 0.6964 | 0.5356 |
No log | 3.9706 | 270 | 0.5399 | 0.6971 | 0.5396 |
No log | 4.0 | 272 | 0.5266 | 0.6904 | 0.5262 |
No log | 4.0294 | 274 | 0.5060 | 0.6846 | 0.5055 |
No log | 4.0588 | 276 | 0.5013 | 0.6800 | 0.5008 |
No log | 4.0882 | 278 | 0.5139 | 0.6966 | 0.5136 |
No log | 4.1176 | 280 | 0.5437 | 0.7060 | 0.5435 |
No log | 4.1471 | 282 | 0.5548 | 0.7156 | 0.5547 |
No log | 4.1765 | 284 | 0.5472 | 0.7077 | 0.5471 |
No log | 4.2059 | 286 | 0.5206 | 0.6987 | 0.5203 |
No log | 4.2353 | 288 | 0.5135 | 0.6877 | 0.5131 |
No log | 4.2647 | 290 | 0.5099 | 0.6846 | 0.5095 |
No log | 4.2941 | 292 | 0.5047 | 0.6839 | 0.5043 |
No log | 4.3235 | 294 | 0.4963 | 0.6646 | 0.4957 |
No log | 4.3529 | 296 | 0.4950 | 0.6537 | 0.4944 |
No log | 4.3824 | 298 | 0.4949 | 0.6661 | 0.4943 |
No log | 4.4118 | 300 | 0.4965 | 0.6708 | 0.4960 |
No log | 4.4412 | 302 | 0.4936 | 0.6622 | 0.4930 |
No log | 4.4706 | 304 | 0.4931 | 0.6599 | 0.4925 |
No log | 4.5 | 306 | 0.4944 | 0.6646 | 0.4939 |
No log | 4.5294 | 308 | 0.4933 | 0.6631 | 0.4927 |
No log | 4.5588 | 310 | 0.4952 | 0.6723 | 0.4947 |
No log | 4.5882 | 312 | 0.4969 | 0.6761 | 0.4964 |
No log | 4.6176 | 314 | 0.5000 | 0.6847 | 0.4996 |
No log | 4.6471 | 316 | 0.5008 | 0.6864 | 0.5005 |
No log | 4.6765 | 318 | 0.5019 | 0.6936 | 0.5015 |
No log | 4.7059 | 320 | 0.5012 | 0.6936 | 0.5009 |
No log | 4.7353 | 322 | 0.5028 | 0.6959 | 0.5024 |
No log | 4.7647 | 324 | 0.4999 | 0.6896 | 0.4995 |
No log | 4.7941 | 326 | 0.5008 | 0.6904 | 0.5004 |
No log | 4.8235 | 328 | 0.5035 | 0.6959 | 0.5031 |
No log | 4.8529 | 330 | 0.5058 | 0.6974 | 0.5055 |
No log | 4.8824 | 332 | 0.5079 | 0.6990 | 0.5076 |
No log | 4.9118 | 334 | 0.5079 | 0.6990 | 0.5076 |
No log | 4.9412 | 336 | 0.5064 | 0.6974 | 0.5061 |
No log | 4.9706 | 338 | 0.5064 | 0.6974 | 0.5061 |
No log | 5.0 | 340 | 0.5066 | 0.6974 | 0.5063 |
### Framework versions
- Transformers 4.42.3
- Pytorch 2.1.2
- Datasets 2.20.0
- Tokenizers 0.19.1