---
license: apache-2.0
base_model: google-bert/bert-base-cased
tags:
- generated_from_trainer
model-index:
- name: bert_baseline_prompt_adherence_task6_fold3
  results: []
---
# bert_baseline_prompt_adherence_task6_fold3
This model is a fine-tuned version of [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) on an unspecified dataset.
It achieves the following results on the evaluation set (a loading sketch follows the metrics):
- Loss: 0.4247
- Qwk: 0.7754
- Mse: 0.4247
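
Given that the evaluation metrics are mean squared error and quadratic weighted kappa, the checkpoint most likely carries a single-logit regression head that scores prompt adherence. Below is a minimal loading sketch under that assumption; the model ID is this repository's name and the input text is a placeholder, so adjust both to your setup:

```python
# Minimal inference sketch. Assumptions: the checkpoint has a single-logit
# regression head and is loadable under this repo ID; the text is a placeholder.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "bert_baseline_prompt_adherence_task6_fold3"  # adjust to your path/Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("Example essay response...", truncation=True, return_tensors="pt")
with torch.no_grad():
    score = model(**inputs).logits.squeeze(-1)  # continuous adherence score
print(score.item())
```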
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
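
A hedged reconstruction of this configuration with the `Trainer` API; only the values listed above come from the card, while the output directory and the commented-out dataset objects are placeholders:

```python
# Sketch of the training configuration. Hypothetical output_dir and datasets;
# hyperparameters match the list above (Adam betas/epsilon are those stated).
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="bert_baseline_prompt_adherence_task6_fold3",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
```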
### Training results
Training Loss | Epoch | Step | Validation Loss | Qwk | Mse |
---|---|---|---|---|---|
No log | 0.0294 | 2 | 1.7612 | 0.0 | 1.7612 |
No log | 0.0588 | 4 | 1.4976 | -0.0486 | 1.4976 |
No log | 0.0882 | 6 | 1.3256 | -0.0045 | 1.3256 |
No log | 0.1176 | 8 | 1.0956 | 0.0 | 1.0956 |
No log | 0.1471 | 10 | 0.9927 | 0.0 | 0.9927 |
No log | 0.1765 | 12 | 0.9586 | 0.0 | 0.9586 |
No log | 0.2059 | 14 | 0.8770 | 0.0 | 0.8770 |
No log | 0.2353 | 16 | 0.8268 | 0.2233 | 0.8268 |
No log | 0.2647 | 18 | 0.7869 | 0.4102 | 0.7869 |
No log | 0.2941 | 20 | 0.7539 | 0.4667 | 0.7539 |
No log | 0.3235 | 22 | 0.6848 | 0.4524 | 0.6848 |
No log | 0.3529 | 24 | 0.6251 | 0.3985 | 0.6251 |
No log | 0.3824 | 26 | 0.5728 | 0.4507 | 0.5728 |
No log | 0.4118 | 28 | 0.5804 | 0.5026 | 0.5804 |
No log | 0.4412 | 30 | 0.6041 | 0.4997 | 0.6041 |
No log | 0.4706 | 32 | 0.4732 | 0.4918 | 0.4732 |
No log | 0.5 | 34 | 0.4584 | 0.5333 | 0.4584 |
No log | 0.5294 | 36 | 0.5895 | 0.5679 | 0.5895 |
No log | 0.5588 | 38 | 0.5094 | 0.6123 | 0.5094 |
No log | 0.5882 | 40 | 0.5028 | 0.6302 | 0.5028 |
No log | 0.6176 | 42 | 0.4369 | 0.6373 | 0.4369 |
No log | 0.6471 | 44 | 0.4934 | 0.4785 | 0.4934 |
No log | 0.6765 | 46 | 0.5004 | 0.4726 | 0.5004 |
No log | 0.7059 | 48 | 0.4262 | 0.7031 | 0.4262 |
No log | 0.7353 | 50 | 0.5825 | 0.6763 | 0.5825 |
No log | 0.7647 | 52 | 0.4681 | 0.6710 | 0.4681 |
No log | 0.7941 | 54 | 0.3762 | 0.6538 | 0.3762 |
No log | 0.8235 | 56 | 0.3737 | 0.6383 | 0.3737 |
No log | 0.8529 | 58 | 0.3822 | 0.6748 | 0.3822 |
No log | 0.8824 | 60 | 0.3905 | 0.6817 | 0.3905 |
No log | 0.9118 | 62 | 0.3853 | 0.6730 | 0.3853 |
No log | 0.9412 | 64 | 0.3845 | 0.6766 | 0.3845 |
No log | 0.9706 | 66 | 0.3865 | 0.6717 | 0.3865 |
No log | 1.0 | 68 | 0.4478 | 0.6696 | 0.4478 |
No log | 1.0294 | 70 | 0.5322 | 0.6747 | 0.5322 |
No log | 1.0588 | 72 | 0.4563 | 0.6951 | 0.4563 |
No log | 1.0882 | 74 | 0.4128 | 0.6789 | 0.4128 |
No log | 1.1176 | 76 | 0.4267 | 0.7117 | 0.4267 |
No log | 1.1471 | 78 | 0.5439 | 0.7687 | 0.5439 |
No log | 1.1765 | 80 | 0.5451 | 0.7671 | 0.5451 |
No log | 1.2059 | 82 | 0.3950 | 0.6714 | 0.3950 |
No log | 1.2353 | 84 | 0.3919 | 0.5799 | 0.3919 |
No log | 1.2647 | 86 | 0.4102 | 0.5500 | 0.4102 |
No log | 1.2941 | 88 | 0.3534 | 0.6453 | 0.3534 |
No log | 1.3235 | 90 | 0.4337 | 0.6800 | 0.4337 |
No log | 1.3529 | 92 | 0.5133 | 0.6574 | 0.5133 |
No log | 1.3824 | 94 | 0.4936 | 0.6765 | 0.4936 |
No log | 1.4118 | 96 | 0.3690 | 0.6859 | 0.3690 |
No log | 1.4412 | 98 | 0.3403 | 0.6334 | 0.3403 |
No log | 1.4706 | 100 | 0.3431 | 0.6301 | 0.3431 |
No log | 1.5 | 102 | 0.3333 | 0.6502 | 0.3333 |
No log | 1.5294 | 104 | 0.4377 | 0.7022 | 0.4377 |
No log | 1.5588 | 106 | 0.5763 | 0.7012 | 0.5763 |
No log | 1.5882 | 108 | 0.5293 | 0.6672 | 0.5293 |
No log | 1.6176 | 110 | 0.4315 | 0.6747 | 0.4315 |
No log | 1.6471 | 112 | 0.3583 | 0.6808 | 0.3583 |
No log | 1.6765 | 114 | 0.3511 | 0.6735 | 0.3511 |
No log | 1.7059 | 116 | 0.3478 | 0.6618 | 0.3478 |
No log | 1.7353 | 118 | 0.3474 | 0.6507 | 0.3474 |
No log | 1.7647 | 120 | 0.3620 | 0.6630 | 0.3620 |
No log | 1.7941 | 122 | 0.3894 | 0.6804 | 0.3894 |
No log | 1.8235 | 124 | 0.4057 | 0.6852 | 0.4057 |
No log | 1.8529 | 126 | 0.4497 | 0.6764 | 0.4497 |
No log | 1.8824 | 128 | 0.3986 | 0.6851 | 0.3986 |
No log | 1.9118 | 130 | 0.3293 | 0.6466 | 0.3293 |
No log | 1.9412 | 132 | 0.3306 | 0.6285 | 0.3306 |
No log | 1.9706 | 134 | 0.3381 | 0.6239 | 0.3381 |
No log | 2.0 | 136 | 0.3413 | 0.6706 | 0.3413 |
No log | 2.0294 | 138 | 0.4640 | 0.7789 | 0.4640 |
No log | 2.0588 | 140 | 0.5489 | 0.7778 | 0.5489 |
No log | 2.0882 | 142 | 0.4773 | 0.7819 | 0.4773 |
No log | 2.1176 | 144 | 0.3711 | 0.7206 | 0.3711 |
No log | 2.1471 | 146 | 0.3393 | 0.6336 | 0.3393 |
No log | 2.1765 | 148 | 0.3465 | 0.6045 | 0.3465 |
No log | 2.2059 | 150 | 0.3300 | 0.6413 | 0.3300 |
No log | 2.2353 | 152 | 0.3591 | 0.7270 | 0.3591 |
No log | 2.2647 | 154 | 0.4846 | 0.7874 | 0.4846 |
No log | 2.2941 | 156 | 0.4862 | 0.7887 | 0.4862 |
No log | 2.3235 | 158 | 0.3858 | 0.7422 | 0.3858 |
No log | 2.3529 | 160 | 0.3396 | 0.6951 | 0.3396 |
No log | 2.3824 | 162 | 0.3593 | 0.7225 | 0.3593 |
No log | 2.4118 | 164 | 0.4085 | 0.7388 | 0.4085 |
No log | 2.4412 | 166 | 0.4379 | 0.7507 | 0.4379 |
No log | 2.4706 | 168 | 0.3953 | 0.7282 | 0.3953 |
No log | 2.5 | 170 | 0.3341 | 0.6774 | 0.3341 |
No log | 2.5294 | 172 | 0.3235 | 0.6494 | 0.3235 |
No log | 2.5588 | 174 | 0.3218 | 0.6610 | 0.3218 |
No log | 2.5882 | 176 | 0.3445 | 0.6891 | 0.3445 |
No log | 2.6176 | 178 | 0.4174 | 0.7484 | 0.4174 |
No log | 2.6471 | 180 | 0.4074 | 0.7478 | 0.4074 |
No log | 2.6765 | 182 | 0.3458 | 0.7047 | 0.3458 |
No log | 2.7059 | 184 | 0.3242 | 0.6716 | 0.3242 |
No log | 2.7353 | 186 | 0.3304 | 0.6942 | 0.3304 |
No log | 2.7647 | 188 | 0.3594 | 0.7320 | 0.3594 |
No log | 2.7941 | 190 | 0.4304 | 0.7863 | 0.4304 |
No log | 2.8235 | 192 | 0.4484 | 0.7965 | 0.4484 |
No log | 2.8529 | 194 | 0.4058 | 0.7579 | 0.4058 |
No log | 2.8824 | 196 | 0.3506 | 0.7108 | 0.3506 |
No log | 2.9118 | 198 | 0.3382 | 0.7120 | 0.3382 |
No log | 2.9412 | 200 | 0.3547 | 0.7253 | 0.3547 |
No log | 2.9706 | 202 | 0.3676 | 0.7164 | 0.3676 |
No log | 3.0 | 204 | 0.3491 | 0.6914 | 0.3491 |
No log | 3.0294 | 206 | 0.3292 | 0.6816 | 0.3292 |
No log | 3.0588 | 208 | 0.3223 | 0.6803 | 0.3223 |
No log | 3.0882 | 210 | 0.3258 | 0.6989 | 0.3258 |
No log | 3.1176 | 212 | 0.3207 | 0.7000 | 0.3207 |
No log | 3.1471 | 214 | 0.3364 | 0.7168 | 0.3364 |
No log | 3.1765 | 216 | 0.3697 | 0.7496 | 0.3697 |
No log | 3.2059 | 218 | 0.3810 | 0.7657 | 0.3810 |
No log | 3.2353 | 220 | 0.3702 | 0.7464 | 0.3702 |
No log | 3.2647 | 222 | 0.3315 | 0.7005 | 0.3315 |
No log | 3.2941 | 224 | 0.3241 | 0.6858 | 0.3241 |
No log | 3.3235 | 226 | 0.3280 | 0.7003 | 0.3280 |
No log | 3.3529 | 228 | 0.3536 | 0.7327 | 0.3536 |
No log | 3.3824 | 230 | 0.4122 | 0.7663 | 0.4122 |
No log | 3.4118 | 232 | 0.4680 | 0.7739 | 0.4680 |
No log | 3.4412 | 234 | 0.4622 | 0.7703 | 0.4622 |
No log | 3.4706 | 236 | 0.4020 | 0.7561 | 0.4020 |
No log | 3.5 | 238 | 0.3521 | 0.7217 | 0.3521 |
No log | 3.5294 | 240 | 0.3329 | 0.6738 | 0.3329 |
No log | 3.5588 | 242 | 0.3358 | 0.6592 | 0.3358 |
No log | 3.5882 | 244 | 0.3324 | 0.6681 | 0.3324 |
No log | 3.6176 | 246 | 0.3434 | 0.7028 | 0.3434 |
No log | 3.6471 | 248 | 0.3973 | 0.7504 | 0.3973 |
No log | 3.6765 | 250 | 0.4777 | 0.7722 | 0.4777 |
No log | 3.7059 | 252 | 0.4898 | 0.7732 | 0.4898 |
No log | 3.7353 | 254 | 0.4411 | 0.7658 | 0.4411 |
No log | 3.7647 | 256 | 0.3781 | 0.7255 | 0.3781 |
No log | 3.7941 | 258 | 0.3491 | 0.7067 | 0.3491 |
No log | 3.8235 | 260 | 0.3422 | 0.6995 | 0.3422 |
No log | 3.8529 | 262 | 0.3415 | 0.7015 | 0.3415 |
No log | 3.8824 | 264 | 0.3556 | 0.7104 | 0.3556 |
No log | 3.9118 | 266 | 0.3833 | 0.7329 | 0.3833 |
No log | 3.9412 | 268 | 0.3956 | 0.7442 | 0.3956 |
No log | 3.9706 | 270 | 0.4082 | 0.7507 | 0.4082 |
No log | 4.0 | 272 | 0.3878 | 0.7382 | 0.3878 |
No log | 4.0294 | 274 | 0.3873 | 0.7374 | 0.3873 |
No log | 4.0588 | 276 | 0.3782 | 0.7269 | 0.3782 |
No log | 4.0882 | 278 | 0.3771 | 0.7269 | 0.3771 |
No log | 4.1176 | 280 | 0.3708 | 0.7163 | 0.3708 |
No log | 4.1471 | 282 | 0.3562 | 0.7133 | 0.3562 |
No log | 4.1765 | 284 | 0.3492 | 0.7124 | 0.3492 |
No log | 4.2059 | 286 | 0.3499 | 0.7111 | 0.3499 |
No log | 4.2353 | 288 | 0.3450 | 0.7103 | 0.3450 |
No log | 4.2647 | 290 | 0.3460 | 0.7077 | 0.3460 |
No log | 4.2941 | 292 | 0.3568 | 0.7159 | 0.3568 |
No log | 4.3235 | 294 | 0.3590 | 0.7237 | 0.3590 |
No log | 4.3529 | 296 | 0.3593 | 0.7223 | 0.3593 |
No log | 4.3824 | 298 | 0.3624 | 0.7269 | 0.3624 |
No log | 4.4118 | 300 | 0.3731 | 0.7336 | 0.3731 |
No log | 4.4412 | 302 | 0.3683 | 0.7301 | 0.3683 |
No log | 4.4706 | 304 | 0.3664 | 0.7280 | 0.3664 |
No log | 4.5 | 306 | 0.3577 | 0.7293 | 0.3577 |
No log | 4.5294 | 308 | 0.3529 | 0.7255 | 0.3529 |
No log | 4.5588 | 310 | 0.3467 | 0.7179 | 0.3467 |
No log | 4.5882 | 312 | 0.3452 | 0.7054 | 0.3452 |
No log | 4.6176 | 314 | 0.3451 | 0.7113 | 0.3451 |
No log | 4.6471 | 316 | 0.3514 | 0.7212 | 0.3514 |
No log | 4.6765 | 318 | 0.3596 | 0.7266 | 0.3596 |
No log | 4.7059 | 320 | 0.3735 | 0.7376 | 0.3735 |
No log | 4.7353 | 322 | 0.3884 | 0.7521 | 0.3884 |
No log | 4.7647 | 324 | 0.4072 | 0.7658 | 0.4072 |
No log | 4.7941 | 326 | 0.4235 | 0.7725 | 0.4235 |
No log | 4.8235 | 328 | 0.4324 | 0.7807 | 0.4324 |
No log | 4.8529 | 330 | 0.4356 | 0.7817 | 0.4356 |
No log | 4.8824 | 332 | 0.4360 | 0.7801 | 0.4360 |
No log | 4.9118 | 334 | 0.4343 | 0.7810 | 0.4343 |
No log | 4.9412 | 336 | 0.4306 | 0.7823 | 0.4306 |
No log | 4.9706 | 338 | 0.4266 | 0.7785 | 0.4266 |
No log | 5.0 | 340 | 0.4247 | 0.7754 | 0.4247 |
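
In the table above, `Qwk` is quadratic weighted kappa and `Mse` is mean squared error on the evaluation set. A sketch of how these are commonly computed for regression-style scorers follows; rounding and clipping predictions to the observed label range before the kappa is an assumption, not something the card states:

```python
# Sketch of the Qwk/Mse metrics. Assumption: predictions are rounded and
# clipped to the label range before computing the quadratic weighted kappa.
import numpy as np
from sklearn.metrics import cohen_kappa_score, mean_squared_error

def compute_metrics(preds: np.ndarray, labels: np.ndarray) -> dict:
    mse = mean_squared_error(labels, preds)
    rounded = np.clip(np.rint(preds), labels.min(), labels.max()).astype(int)
    qwk = cohen_kappa_score(labels.astype(int), rounded, weights="quadratic")
    return {"qwk": qwk, "mse": mse}

# Example: compute_metrics(np.array([1.2, 2.8, 0.1]), np.array([1, 3, 0]))
```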
### Framework versions
- Transformers 4.42.3
- Pytorch 2.1.2
- Datasets 2.20.0
- Tokenizers 0.19.1