metadata
license: mit
base_model: microsoft/mdeberta-v3-base
tags:
  - generated_from_trainer
datasets:
  - massive
metrics:
  - accuracy
  - f1
model-index:
  - name: scenario-KD-PR-MSV-D2_data-AmazonScience_massive_all_1_166
    results: []

scenario-KD-PR-MSV-D2_data-AmazonScience_massive_all_1_166

This model is a fine-tuned version of microsoft/mdeberta-v3-base on the massive dataset. It achieves the following results on the evaluation set, which match the final logged checkpoint (step 560000, epoch 29.93):

  • Loss: 1.5394
  • Accuracy: 0.8627
  • F1: 0.8453

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 66
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
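
To make the `linear` scheduler setting concrete, here is a minimal sketch of a linear-decay learning rate using the values listed above. It is an illustration under stated assumptions, not the training script: the card lists no warmup, so `warmup_steps=0` is assumed, and the total step count is estimated from the last logged row (step 560000 at epoch 29.93) rather than taken from the run itself. The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are hypothetical.

```python
# Values taken from the hyperparameter list.
LEARNING_RATE = 5e-05
NUM_EPOCHS = 30

# Assumption: steps per epoch estimated from the final log row
# (step 560000 at epoch 29.93); the exact value is not stated in the card.
STEPS_PER_EPOCH = round(560000 / 29.93)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=LEARNING_RATE, warmup_steps=0):
    """Learning rate at `step` for linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if warmup_steps and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {s:>7}: lr = {linear_lr(s):.3e}")
```

Under these assumptions the rate falls from 5e-05 at step 0 to roughly half that value midway through training and to zero at the final step.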

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
| 3.4018 | 0.27 | 5000 | 3.3570 | 0.7587 | 0.6728 |
| 2.4831 | 0.53 | 10000 | 2.6998 | 0.8028 | 0.7596 |
| 2.1077 | 0.8 | 15000 | 2.4164 | 0.8202 | 0.7862 |
| 1.5375 | 1.07 | 20000 | 2.2821 | 0.8317 | 0.8012 |
| 1.4474 | 1.34 | 25000 | 2.2615 | 0.8325 | 0.8043 |
| 1.3836 | 1.6 | 30000 | 2.1448 | 0.8366 | 0.8133 |
| 1.2782 | 1.87 | 35000 | 2.1212 | 0.8390 | 0.8135 |
| 0.983 | 2.14 | 40000 | 2.0840 | 0.8429 | 0.8209 |
| 0.9416 | 2.41 | 45000 | 2.1779 | 0.8422 | 0.8206 |
| 0.9138 | 2.67 | 50000 | 2.0942 | 0.8447 | 0.8247 |
| 0.9093 | 2.94 | 55000 | 2.0603 | 0.8454 | 0.8196 |
| 0.7465 | 3.21 | 60000 | 2.0972 | 0.8441 | 0.8245 |
| 0.6996 | 3.47 | 65000 | 2.0355 | 0.8475 | 0.8270 |
| 0.7454 | 3.74 | 70000 | 1.9610 | 0.8487 | 0.8298 |
| 0.6906 | 4.01 | 75000 | 2.0084 | 0.8467 | 0.8288 |
| 0.6203 | 4.28 | 80000 | 1.9601 | 0.8498 | 0.8306 |
| 0.6039 | 4.54 | 85000 | 1.9766 | 0.8509 | 0.8347 |
| 0.616 | 4.81 | 90000 | 1.9302 | 0.8518 | 0.8295 |
| 0.5404 | 5.08 | 95000 | 1.9323 | 0.8512 | 0.8301 |
| 0.5448 | 5.34 | 100000 | 1.9360 | 0.8533 | 0.8383 |
| 0.5377 | 5.61 | 105000 | 1.9353 | 0.8511 | 0.8292 |
| 0.5373 | 5.88 | 110000 | 1.9015 | 0.8506 | 0.8318 |
| 0.4744 | 6.15 | 115000 | 1.9116 | 0.8527 | 0.8333 |
| 0.4885 | 6.41 | 120000 | 1.8676 | 0.8543 | 0.8370 |
| 0.4886 | 6.68 | 125000 | 1.8716 | 0.8546 | 0.8344 |
| 0.4861 | 6.95 | 130000 | 1.8664 | 0.8535 | 0.8319 |
| 0.4488 | 7.22 | 135000 | 1.8560 | 0.8547 | 0.8376 |
| 0.426 | 7.48 | 140000 | 1.8350 | 0.8535 | 0.8334 |
| 0.4451 | 7.75 | 145000 | 1.8258 | 0.8544 | 0.8333 |
| 0.4299 | 8.02 | 150000 | 1.8220 | 0.8560 | 0.8370 |
| 0.4207 | 8.28 | 155000 | 1.8048 | 0.8559 | 0.8373 |
| 0.4033 | 8.55 | 160000 | 1.8295 | 0.8538 | 0.8367 |
| 0.4039 | 8.82 | 165000 | 1.7818 | 0.8566 | 0.8391 |
| 0.3874 | 9.09 | 170000 | 1.7857 | 0.8563 | 0.8391 |
| 0.3843 | 9.35 | 175000 | 1.7860 | 0.8548 | 0.8374 |
| 0.3882 | 9.62 | 180000 | 1.8074 | 0.8558 | 0.8374 |
| 0.3866 | 9.89 | 185000 | 1.7823 | 0.8583 | 0.8404 |
| 0.36 | 10.15 | 190000 | 1.7294 | 0.8571 | 0.8375 |
| 0.3592 | 10.42 | 195000 | 1.7363 | 0.8578 | 0.8399 |
| 0.3628 | 10.69 | 200000 | 1.7460 | 0.8582 | 0.8385 |
| 0.3579 | 10.96 | 205000 | 1.7431 | 0.8580 | 0.8399 |
| 0.3448 | 11.22 | 210000 | 1.7398 | 0.8564 | 0.8378 |
| 0.3512 | 11.49 | 215000 | 1.7193 | 0.8584 | 0.8402 |
| 0.3367 | 11.76 | 220000 | 1.7197 | 0.8594 | 0.8425 |
| 0.327 | 12.03 | 225000 | 1.7189 | 0.8576 | 0.8385 |
| 0.3248 | 12.29 | 230000 | 1.6991 | 0.8602 | 0.8398 |
| 0.3306 | 12.56 | 235000 | 1.7119 | 0.8577 | 0.8404 |
| 0.3181 | 12.83 | 240000 | 1.6892 | 0.8606 | 0.8414 |
| 0.3167 | 13.09 | 245000 | 1.6647 | 0.8590 | 0.8380 |
| 0.3149 | 13.36 | 250000 | 1.6780 | 0.8590 | 0.8414 |
| 0.3221 | 13.63 | 255000 | 1.6626 | 0.8601 | 0.8437 |
| 0.3147 | 13.9 | 260000 | 1.7135 | 0.8595 | 0.8418 |
| 0.2954 | 14.16 | 265000 | 1.6915 | 0.8581 | 0.8390 |
| 0.2912 | 14.43 | 270000 | 1.6699 | 0.8582 | 0.8392 |
| 0.3123 | 14.7 | 275000 | 1.6659 | 0.8589 | 0.8399 |
| 0.3047 | 14.96 | 280000 | 1.6654 | 0.8610 | 0.8443 |
| 0.2916 | 15.23 | 285000 | 1.6408 | 0.8600 | 0.8421 |
| 0.282 | 15.5 | 290000 | 1.6729 | 0.8580 | 0.8405 |
| 0.2843 | 15.77 | 295000 | 1.6475 | 0.8600 | 0.8416 |
| 0.2764 | 16.03 | 300000 | 1.6342 | 0.8607 | 0.8426 |
| 0.2726 | 16.3 | 305000 | 1.6541 | 0.8597 | 0.8425 |
| 0.2895 | 16.57 | 310000 | 1.6280 | 0.8597 | 0.8413 |
| 0.2744 | 16.84 | 315000 | 1.6453 | 0.8607 | 0.8422 |
| 0.2727 | 17.1 | 320000 | 1.6319 | 0.8600 | 0.8432 |
| 0.2708 | 17.37 | 325000 | 1.6395 | 0.8599 | 0.8427 |
| 0.271 | 17.64 | 330000 | 1.6232 | 0.8600 | 0.8403 |
| 0.2695 | 17.9 | 335000 | 1.6294 | 0.8597 | 0.8419 |
| 0.2698 | 18.17 | 340000 | 1.6158 | 0.8620 | 0.8438 |
| 0.2582 | 18.44 | 345000 | 1.6214 | 0.8625 | 0.8448 |
| 0.2614 | 18.71 | 350000 | 1.6112 | 0.8610 | 0.8431 |
| 0.2583 | 18.97 | 355000 | 1.5978 | 0.8620 | 0.8440 |
| 0.258 | 19.24 | 360000 | 1.5902 | 0.8623 | 0.8446 |
| 0.2498 | 19.51 | 365000 | 1.6081 | 0.8611 | 0.8427 |
| 0.2569 | 19.77 | 370000 | 1.6165 | 0.8604 | 0.8420 |
| 0.2395 | 20.04 | 375000 | 1.5880 | 0.8614 | 0.8433 |
| 0.2527 | 20.31 | 380000 | 1.6055 | 0.8599 | 0.8428 |
| 0.2504 | 20.58 | 385000 | 1.5929 | 0.8614 | 0.8443 |
| 0.2494 | 20.84 | 390000 | 1.5841 | 0.8624 | 0.8444 |
| 0.2434 | 21.11 | 395000 | 1.5833 | 0.8614 | 0.8446 |
| 0.243 | 21.38 | 400000 | 1.5739 | 0.8619 | 0.8438 |
| 0.2389 | 21.65 | 405000 | 1.5816 | 0.8619 | 0.8438 |
| 0.2467 | 21.91 | 410000 | 1.5844 | 0.8616 | 0.8439 |
| 0.2352 | 22.18 | 415000 | 1.5748 | 0.8628 | 0.8446 |
| 0.2323 | 22.45 | 420000 | 1.5654 | 0.8623 | 0.8427 |
| 0.2314 | 22.71 | 425000 | 1.5537 | 0.8627 | 0.8449 |
| 0.238 | 22.98 | 430000 | 1.5613 | 0.8624 | 0.8424 |
| 0.223 | 23.25 | 435000 | 1.5661 | 0.8626 | 0.8441 |
| 0.2287 | 23.52 | 440000 | 1.5714 | 0.8627 | 0.8447 |
| 0.239 | 23.78 | 445000 | 1.5594 | 0.8634 | 0.8455 |
| 0.2275 | 24.05 | 450000 | 1.5629 | 0.8615 | 0.8436 |
| 0.2232 | 24.32 | 455000 | 1.5725 | 0.8618 | 0.8451 |
| 0.2267 | 24.58 | 460000 | 1.5550 | 0.8627 | 0.8455 |
| 0.2248 | 24.85 | 465000 | 1.5574 | 0.8633 | 0.8455 |
| 0.2214 | 25.12 | 470000 | 1.5602 | 0.8613 | 0.8432 |
| 0.2205 | 25.39 | 475000 | 1.5599 | 0.8617 | 0.8432 |
| 0.2189 | 25.65 | 480000 | 1.5395 | 0.8620 | 0.8452 |
| 0.2174 | 25.92 | 485000 | 1.5577 | 0.8625 | 0.8445 |
| 0.2148 | 26.19 | 490000 | 1.5533 | 0.8628 | 0.8457 |
| 0.2175 | 26.46 | 495000 | 1.5496 | 0.8619 | 0.8443 |
| 0.2121 | 26.72 | 500000 | 1.5509 | 0.8617 | 0.8443 |
| 0.2163 | 26.99 | 505000 | 1.5560 | 0.8624 | 0.8453 |
| 0.211 | 27.26 | 510000 | 1.5491 | 0.8629 | 0.8459 |
| 0.2142 | 27.52 | 515000 | 1.5576 | 0.8607 | 0.8438 |
| 0.2084 | 27.79 | 520000 | 1.5522 | 0.8624 | 0.8456 |
| 0.2119 | 28.06 | 525000 | 1.5429 | 0.8621 | 0.8449 |
| 0.2008 | 28.33 | 530000 | 1.5452 | 0.8627 | 0.8465 |
| 0.2084 | 28.59 | 535000 | 1.5458 | 0.8628 | 0.8464 |
| 0.2086 | 28.86 | 540000 | 1.5454 | 0.8622 | 0.8446 |
| 0.2102 | 29.13 | 545000 | 1.5487 | 0.8622 | 0.8449 |
| 0.2118 | 29.39 | 550000 | 1.5448 | 0.8621 | 0.8451 |
| 0.2049 | 29.66 | 555000 | 1.5411 | 0.8626 | 0.8454 |
| 0.2026 | 29.93 | 560000 | 1.5394 | 0.8627 | 0.8453 |
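
The log above can be used to compare checkpoints by different criteria. The following is a minimal sketch that copies a handful of rows from the training results and picks the best one per metric. The helper names (`to_records`, `best_checkpoint`) are hypothetical, and only the embedded subset of rows is considered, so the output describes that subset rather than the full run.

```python
# A few rows copied from the training results:
# (train_loss, epoch, step, val_loss, accuracy, f1)
ROWS = [
    (3.4018, 0.27, 5000, 3.3570, 0.7587, 0.6728),
    (0.2390, 23.78, 445000, 1.5594, 0.8634, 0.8455),
    (0.2189, 25.65, 480000, 1.5395, 0.8620, 0.8452),
    (0.2008, 28.33, 530000, 1.5452, 0.8627, 0.8465),
    (0.2026, 29.93, 560000, 1.5394, 0.8627, 0.8453),
]
FIELDS = ("train_loss", "epoch", "step", "val_loss", "accuracy", "f1")


def to_records(rows):
    """Turn positional rows into dicts keyed by column name."""
    return [dict(zip(FIELDS, row)) for row in rows]


def best_checkpoint(records, metric, higher_is_better=True):
    """Return the record with the best value for `metric`."""
    pick = max if higher_is_better else min
    return pick(records, key=lambda r: r[metric])


records = to_records(ROWS)
best_f1 = best_checkpoint(records, "f1")
best_acc = best_checkpoint(records, "accuracy")
lowest_loss = best_checkpoint(records, "val_loss", higher_is_better=False)

if __name__ == "__main__":
    print("best F1:       step", best_f1["step"], "F1", best_f1["f1"])
    print("best accuracy: step", best_acc["step"], "acc", best_acc["accuracy"])
    print("lowest loss:   step", lowest_loss["step"], "loss", lowest_loss["val_loss"])
```

Among these rows the three criteria select different steps, which illustrates that the final checkpoint reported at the top of the card is not necessarily the best one on every metric.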

Framework versions

  • Transformers 4.33.3
  • PyTorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.13.3