metadata
base_model: microsoft/mdeberta-v3-base
datasets:
  - massive
library_name: transformers
license: mit
metrics:
  - accuracy
  - f1
tags:
  - generated_from_trainer
model-index:
  - name: scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_155
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: massive
          type: massive
          config: all_1.1
          split: validation
          args: all_1.1
        metrics:
          - type: accuracy
            value: 0.8146070604260471
            name: Accuracy
          - type: f1
            value: 0.7894820718803818
            name: F1

scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_155

This model is a fine-tuned version of microsoft/mdeberta-v3-base on the massive dataset (config all_1.1). On the validation split it achieves the following results, which correspond to the final logged checkpoint (step 185000):

  • Loss: 1.9705
  • Accuracy: 0.8146
  • F1: 0.7895

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 55
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
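
The list above gives a linear scheduler with a peak learning rate of 5e-05 over 10 epochs. As a rough sketch of what that implies, the snippet below estimates the steps per epoch from the last row of the training results (step 185000 at epoch 9.8872) and evaluates a linear decay. Zero warmup, decay to exactly zero, and the helper name `linear_lr` are assumptions for illustration; this card does not list warmup settings.

```python
# Values taken from this card.
BASE_LR = 5e-05
NUM_EPOCHS = 10

# Last logged row of the training results: step 185000 at epoch 9.8872.
steps_per_epoch = round(185000 / 9.8872)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step: int, total: int = total_steps,
              base_lr: float = BASE_LR, warmup: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay.

    Assumption: warmup defaults to 0 and the rate decays to zero at `total`.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    remaining = max(0, total - step)
    return base_lr * remaining / max(1, total - warmup)


if __name__ == "__main__":
    print("estimated steps per epoch:", steps_per_epoch)
    for step in (0, 50_000, 100_000, 185_000):
        print(step, linear_lr(step))
```

Under these assumptions, the estimate is about 18,711 steps per epoch. With a train batch size of 32 that would be roughly 598,752 examples per epoch, provided training ran on a single device without gradient accumulation, which the card does not state.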

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
| 1.2731 | 0.2672 | 5000 | 1.2901 | 0.6543 | 0.5621 |
| 0.9748 | 0.5344 | 10000 | 0.9900 | 0.7385 | 0.6906 |
| 0.8375 | 0.8017 | 15000 | 0.8798 | 0.7648 | 0.7104 |
| 0.5776 | 1.0689 | 20000 | 0.8718 | 0.7814 | 0.7300 |
| 0.5844 | 1.3361 | 25000 | 0.8115 | 0.7917 | 0.7475 |
| 0.5128 | 1.6033 | 30000 | 0.8029 | 0.7986 | 0.7618 |
| 0.5164 | 1.8706 | 35000 | 0.7975 | 0.7978 | 0.7526 |
| 0.3296 | 2.1378 | 40000 | 0.8562 | 0.8030 | 0.7669 |
| 0.356 | 2.4050 | 45000 | 0.8397 | 0.8052 | 0.7660 |
| 0.3481 | 2.6722 | 50000 | 0.8293 | 0.8111 | 0.7798 |
| 0.3435 | 2.9394 | 55000 | 0.8290 | 0.8091 | 0.7780 |
| 0.2186 | 3.2067 | 60000 | 0.9522 | 0.8106 | 0.7777 |
| 0.2362 | 3.4739 | 65000 | 0.9482 | 0.8115 | 0.7782 |
| 0.2341 | 3.7411 | 70000 | 0.9290 | 0.8097 | 0.7801 |
| 0.2062 | 4.0083 | 75000 | 0.9605 | 0.8145 | 0.7868 |
| 0.1568 | 4.2756 | 80000 | 1.0468 | 0.8117 | 0.7825 |
| 0.1572 | 4.5428 | 85000 | 1.1166 | 0.8109 | 0.7838 |
| 0.1591 | 4.8100 | 90000 | 1.0949 | 0.8111 | 0.7859 |
| 0.0872 | 5.0772 | 95000 | 1.2311 | 0.8129 | 0.7868 |
| 0.0978 | 5.3444 | 100000 | 1.3205 | 0.8064 | 0.7780 |
| 0.104 | 5.6117 | 105000 | 1.2794 | 0.8124 | 0.7842 |
| 0.1035 | 5.8789 | 110000 | 1.2706 | 0.8140 | 0.7871 |
| 0.0615 | 6.1461 | 115000 | 1.4577 | 0.8114 | 0.7851 |
| 0.0692 | 6.4133 | 120000 | 1.4930 | 0.8097 | 0.7866 |
| 0.0662 | 6.6806 | 125000 | 1.5160 | 0.8125 | 0.7887 |
| 0.0685 | 6.9478 | 130000 | 1.5319 | 0.8124 | 0.7873 |
| 0.0481 | 7.2150 | 135000 | 1.6618 | 0.8107 | 0.7871 |
| 0.0448 | 7.4822 | 140000 | 1.7140 | 0.8119 | 0.7864 |
| 0.0405 | 7.7495 | 145000 | 1.7438 | 0.8141 | 0.7894 |
| 0.0303 | 8.0167 | 150000 | 1.8255 | 0.8116 | 0.7850 |
| 0.025 | 8.2839 | 155000 | 1.8547 | 0.8135 | 0.7898 |
| 0.0302 | 8.5511 | 160000 | 1.8674 | 0.8150 | 0.7891 |
| 0.0293 | 8.8183 | 165000 | 1.8820 | 0.8131 | 0.7890 |
| 0.0177 | 9.0856 | 170000 | 1.9414 | 0.8140 | 0.7906 |
| 0.0164 | 9.3528 | 175000 | 1.9824 | 0.8130 | 0.7898 |
| 0.019 | 9.6200 | 180000 | 1.9458 | 0.8139 | 0.7889 |
| 0.0203 | 9.8872 | 185000 | 1.9705 | 0.8146 | 0.7895 |
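
The log above shows validation loss bottoming out early (0.7975 at step 35000) and then climbing to 1.9705 by the end of training, while accuracy stays between 0.8030 and 0.8150 from step 40000 onward. The following sketch, which is not part of the original training code, parses the logged rows and reports which evaluation step is best under each metric. The `Eval` tuple and `parse` helper are hypothetical names introduced here for illustration.

```python
from typing import NamedTuple

# Rows copied from the training results above:
# training loss, epoch, step, validation loss, accuracy, F1
TABLE = """
1.2731 0.2672 5000 1.2901 0.6543 0.5621
0.9748 0.5344 10000 0.9900 0.7385 0.6906
0.8375 0.8017 15000 0.8798 0.7648 0.7104
0.5776 1.0689 20000 0.8718 0.7814 0.7300
0.5844 1.3361 25000 0.8115 0.7917 0.7475
0.5128 1.6033 30000 0.8029 0.7986 0.7618
0.5164 1.8706 35000 0.7975 0.7978 0.7526
0.3296 2.1378 40000 0.8562 0.8030 0.7669
0.356 2.4050 45000 0.8397 0.8052 0.7660
0.3481 2.6722 50000 0.8293 0.8111 0.7798
0.3435 2.9394 55000 0.8290 0.8091 0.7780
0.2186 3.2067 60000 0.9522 0.8106 0.7777
0.2362 3.4739 65000 0.9482 0.8115 0.7782
0.2341 3.7411 70000 0.9290 0.8097 0.7801
0.2062 4.0083 75000 0.9605 0.8145 0.7868
0.1568 4.2756 80000 1.0468 0.8117 0.7825
0.1572 4.5428 85000 1.1166 0.8109 0.7838
0.1591 4.8100 90000 1.0949 0.8111 0.7859
0.0872 5.0772 95000 1.2311 0.8129 0.7868
0.0978 5.3444 100000 1.3205 0.8064 0.7780
0.104 5.6117 105000 1.2794 0.8124 0.7842
0.1035 5.8789 110000 1.2706 0.8140 0.7871
0.0615 6.1461 115000 1.4577 0.8114 0.7851
0.0692 6.4133 120000 1.4930 0.8097 0.7866
0.0662 6.6806 125000 1.5160 0.8125 0.7887
0.0685 6.9478 130000 1.5319 0.8124 0.7873
0.0481 7.2150 135000 1.6618 0.8107 0.7871
0.0448 7.4822 140000 1.7140 0.8119 0.7864
0.0405 7.7495 145000 1.7438 0.8141 0.7894
0.0303 8.0167 150000 1.8255 0.8116 0.7850
0.025 8.2839 155000 1.8547 0.8135 0.7898
0.0302 8.5511 160000 1.8674 0.8150 0.7891
0.0293 8.8183 165000 1.8820 0.8131 0.7890
0.0177 9.0856 170000 1.9414 0.8140 0.7906
0.0164 9.3528 175000 1.9824 0.8130 0.7898
0.019 9.6200 180000 1.9458 0.8139 0.7889
0.0203 9.8872 185000 1.9705 0.8146 0.7895
"""


class Eval(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


def parse(table: str) -> list[Eval]:
    """Turn whitespace-separated log rows into Eval records."""
    rows = []
    for line in table.strip().splitlines():
        tl, ep, st, vl, acc, f1 = line.split()
        rows.append(Eval(float(tl), float(ep), int(st),
                         float(vl), float(acc), float(f1)))
    return rows


evals = parse(TABLE)
best_loss = min(evals, key=lambda e: e.val_loss)
best_acc = max(evals, key=lambda e: e.accuracy)
best_f1 = max(evals, key=lambda e: e.f1)

if __name__ == "__main__":
    print("lowest validation loss:", best_loss.step, best_loss.val_loss)
    print("highest accuracy:", best_acc.step, best_acc.accuracy)
    print("highest F1:", best_f1.step, best_f1.f1)
```

The three criteria pick different steps, so which checkpoint counts as "best" depends on the metric chosen. The headline numbers at the top of this card match the final logged row rather than any of these optima.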

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.19.1