metadata

base_model: microsoft/mdeberta-v3-base
datasets:
  - massive
library_name: transformers
license: mit
metrics:
  - accuracy
  - f1
tags:
  - generated_from_trainer
model-index:
  - name: scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_144
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: massive
          type: massive
          config: all_1.1
          split: validation
          args: all_1.1
        metrics:
          - type: accuracy
            value: 0.8158178516024065
            name: Accuracy
          - type: f1
            value: 0.790068262479823
            name: F1

scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_144

This model is a fine-tuned version of microsoft/mdeberta-v3-base on the massive dataset (configuration all_1.1). It achieves the following results on the validation split at the final logged step (185000):

Loss: 1.9525
Accuracy: 0.8158
F1: 0.7901
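
A minimal inference sketch, assuming the checkpoint has been saved to a local directory and that its config carries an `id2label` mapping. The `model_dir` argument is a placeholder, since this card does not state the full repository id, and `top_label` and `classify` are illustrative helpers rather than part of transformers.

```python
import math


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    # Numerically stable softmax over one row of logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    best = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best], exps[best] / total


def classify(text, model_dir):
    """Classify one utterance with a locally stored checkpoint.

    `model_dir` is a hypothetical path; point it at wherever the
    fine-tuned weights and tokenizer files live.
    """
    # Imported lazily so top_label works without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_label(logits, model.config.id2label)
```

Calling `classify("wake me up at nine", "path/to/checkpoint")` would return a label and its softmax probability; the label names depend entirely on what the checkpoint's config stores.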

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 44
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
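
The values above can be written as keyword arguments for `transformers.TrainingArguments`, and the linear schedule can be sketched in a few lines. This is a rough illustration under stated assumptions: training ran on one device (so the listed batch sizes are per-device sizes), no warmup was used (the card lists none), and the total step count is estimated from the last row of the results table rather than reported by the source. `linear_lr` is an illustrative helper, not the library's scheduler.

```python
# Values copied from the card; keys are TrainingArguments keyword names.
# Assumption: one device, so the listed batch sizes are per-device sizes.
HPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "seed": 44,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}

# Estimate, not a reported figure: the results table logs
# step 185000 at epoch 9.8872.
STEPS_PER_EPOCH = round(185000 / 9.8872)
TOTAL_STEPS = STEPS_PER_EPOCH * HPARAMS["num_train_epochs"]


def linear_lr(step, total_steps=TOTAL_STEPS,
              base_lr=HPARAMS["learning_rate"], warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With these assumptions the rate starts at 5e-05 and reaches zero at the estimated final step. To reuse the settings, something like `TrainingArguments(output_dir="out", **HPARAMS)` should work, where `"out"` is a hypothetical directory.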

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
1.2857	0.2672	5000	1.2982	0.6464	0.5527
0.9859	0.5344	10000	1.0042	0.7331	0.6625
0.8126	0.8017	15000	0.9054	0.7613	0.6991
0.5568	1.0689	20000	0.8391	0.7828	0.7319
0.5531	1.3361	25000	0.8316	0.7886	0.7394
0.5497	1.6033	30000	0.7894	0.8011	0.7618
0.5099	1.8706	35000	0.7805	0.8050	0.7616
0.3327	2.1378	40000	0.8676	0.8040	0.7633
0.3482	2.4050	45000	0.8556	0.8060	0.7700
0.3506	2.6722	50000	0.8309	0.8087	0.7816
0.3508	2.9394	55000	0.8149	0.8105	0.7683
0.2210	3.2067	60000	0.9645	0.8070	0.7760
0.2220	3.4739	65000	0.9305	0.8113	0.7836
0.2414	3.7411	70000	0.9195	0.8122	0.7846
0.2032	4.0083	75000	0.9858	0.8141	0.7855
0.1457	4.2756	80000	1.0865	0.8130	0.7885
0.1550	4.5428	85000	1.0413	0.8133	0.7830
0.1535	4.8100	90000	1.0934	0.8157	0.7887
0.0888	5.0772	95000	1.2135	0.8152	0.7896
0.0931	5.3444	100000	1.3402	0.8121	0.7857
0.1024	5.6117	105000	1.2838	0.8107	0.7848
0.1044	5.8789	110000	1.3039	0.8133	0.7885
0.0595	6.1461	115000	1.4268	0.8129	0.7877
0.0678	6.4133	120000	1.4729	0.8132	0.7866
0.0676	6.6806	125000	1.5201	0.8127	0.7859
0.0779	6.9478	130000	1.4956	0.8151	0.7905
0.0429	7.2150	135000	1.6860	0.8142	0.7897
0.0507	7.4822	140000	1.6751	0.8124	0.7842
0.0463	7.7495	145000	1.7002	0.8133	0.7866
0.0340	8.0167	150000	1.7596	0.8135	0.7885
0.0254	8.2839	155000	1.8539	0.8133	0.7876
0.0294	8.5511	160000	1.8675	0.8146	0.7862
0.0296	8.8183	165000	1.8644	0.8142	0.7862
0.0174	9.0856	170000	1.9111	0.8151	0.7899
0.0159	9.3528	175000	1.9342	0.8156	0.7896
0.0171	9.6200	180000	1.9399	0.8161	0.7901
0.0209	9.8872	185000	1.9525	0.8158	0.7901
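
In the table, validation loss reaches its minimum early and then climbs while accuracy and F1 level off, so the best checkpoint depends on which metric is used for selection. The sketch below locates the best row per metric; `ROWS` is transcribed from the table and `best_row` is an illustrative helper.

```python
# (step, validation_loss, accuracy, f1) copied from the results table.
ROWS = [
    (5000, 1.2982, 0.6464, 0.5527),
    (10000, 1.0042, 0.7331, 0.6625),
    (15000, 0.9054, 0.7613, 0.6991),
    (20000, 0.8391, 0.7828, 0.7319),
    (25000, 0.8316, 0.7886, 0.7394),
    (30000, 0.7894, 0.8011, 0.7618),
    (35000, 0.7805, 0.8050, 0.7616),
    (40000, 0.8676, 0.8040, 0.7633),
    (45000, 0.8556, 0.8060, 0.7700),
    (50000, 0.8309, 0.8087, 0.7816),
    (55000, 0.8149, 0.8105, 0.7683),
    (60000, 0.9645, 0.8070, 0.7760),
    (65000, 0.9305, 0.8113, 0.7836),
    (70000, 0.9195, 0.8122, 0.7846),
    (75000, 0.9858, 0.8141, 0.7855),
    (80000, 1.0865, 0.8130, 0.7885),
    (85000, 1.0413, 0.8133, 0.7830),
    (90000, 1.0934, 0.8157, 0.7887),
    (95000, 1.2135, 0.8152, 0.7896),
    (100000, 1.3402, 0.8121, 0.7857),
    (105000, 1.2838, 0.8107, 0.7848),
    (110000, 1.3039, 0.8133, 0.7885),
    (115000, 1.4268, 0.8129, 0.7877),
    (120000, 1.4729, 0.8132, 0.7866),
    (125000, 1.5201, 0.8127, 0.7859),
    (130000, 1.4956, 0.8151, 0.7905),
    (135000, 1.6860, 0.8142, 0.7897),
    (140000, 1.6751, 0.8124, 0.7842),
    (145000, 1.7002, 0.8133, 0.7866),
    (150000, 1.7596, 0.8135, 0.7885),
    (155000, 1.8539, 0.8133, 0.7876),
    (160000, 1.8675, 0.8146, 0.7862),
    (165000, 1.8644, 0.8142, 0.7862),
    (170000, 1.9111, 0.8151, 0.7899),
    (175000, 1.9342, 0.8156, 0.7896),
    (180000, 1.9399, 0.8161, 0.7901),
    (185000, 1.9525, 0.8158, 0.7901),
]


def best_row(rows, column, higher_is_better):
    """Return the row with the best value in the given column index."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


lowest_loss = best_row(ROWS, 1, higher_is_better=False)
best_accuracy = best_row(ROWS, 2, higher_is_better=True)
best_f1 = best_row(ROWS, 3, higher_is_better=True)

for name, row in [("lowest validation loss", lowest_loss),
                  ("best accuracy", best_accuracy),
                  ("best F1", best_f1),
                  ("final step", ROWS[-1])]:
    print(f"{name}: step {row[0]}, loss {row[1]}, acc {row[2]}, f1 {row[3]}")
```

On these rows the lowest validation loss falls at step 35000 (0.7805), the best accuracy at step 180000 (0.8161), and the best F1 at step 130000 (0.7905), while the reported final step (185000) has the highest validation loss after the first few evaluations.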

Framework versions

Transformers 4.44.2
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.19.1
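
To check a local environment against the versions listed above, a small sketch using the standard library's `importlib.metadata` may help. The mapping of "Pytorch" to the `torch` distribution name is an assumption about how the package was installed, and `installed_version` and `version_mismatches` are illustrative helpers.

```python
from importlib import metadata

# Versions listed on the card; keys are PyPI distribution names
# (assumption: "Pytorch" on the card corresponds to `torch`).
EXPECTED = {
    "transformers": "4.44.2",
    "torch": "2.1.1+cu121",
    "datasets": "2.14.5",
    "tokenizers": "0.19.1",
}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            report[name] = (wanted, found)
    return report
```

An empty dictionary from `version_mismatches()` means the environment matches the card exactly; a mismatch does not necessarily prevent the model from loading.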