---
license: mit
base_model: microsoft/deberta-v3-xsmall
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: recommendation-news-clicked-random-select-and-filter
  results: []
---

# recommendation-news-clicked-random-select-and-filter

This model is a fine-tuned version of [microsoft/deberta-v3-xsmall](https://huggingface.co/microsoft/deberta-v3-xsmall) on an unknown dataset. It achieves the following results on the evaluation set (a loading sketch follows these results):

- Loss: 0.5736
- Accuracy: 0.7001
- Macro F1: 0.6446
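
A minimal loading sketch, assuming the checkpoint is published as `DandinPower/recommendation-news-clicked-random-select-and-filter` and framed as binary click prediction; neither the repository id nor the label schema is documented in this card, so treat both as assumptions:

```python
# Hypothetical usage sketch: loading this checkpoint for sequence
# classification. The label meanings (e.g. clicked vs. not clicked)
# are an assumption; the card does not document the dataset or labels.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo = "DandinPower/recommendation-news-clicked-random-select-and-filter"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)

inputs = tokenizer("Example news headline", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()
print(pred)  # class index; the mapping to clicked/not-clicked is assumed
```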

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

- learning_rate: 4.5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
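
As a rough sketch, these settings map onto a `transformers.TrainingArguments` configuration along the following lines; everything beyond the listed values (the output directory, evaluation cadence, and so on) is an assumption, not taken from the original training script:

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
# output_dir and any unlisted arguments are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="recommendation-news-clicked-random-select-and-filter",
    learning_rate=4.5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=4,   # 32 * 4 = 128 total train batch size
    seed=42,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```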

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Macro F1 |
|:-------------:|:------:|:----:|:---------------:|:--------:|:--------:|
| 0.6446        | 0.0224 | 200  | 0.6369          | 0.6658   | 0.3997   |
| 0.65          | 0.0448 | 400  | 0.6337          | 0.6658   | 0.3997   |
| 0.6092        | 0.0672 | 600  | 0.6087          | 0.6725   | 0.5938   |
| 0.5988        | 0.0895 | 800  | 0.5995          | 0.6872   | 0.5664   |
| 0.5839        | 0.1119 | 1000 | 0.5979          | 0.6937   | 0.5908   |
| 0.6082        | 0.1343 | 1200 | 0.5879          | 0.6936   | 0.6149   |
| 0.5912        | 0.1567 | 1400 | 0.5857          | 0.6946   | 0.5626   |
| 0.5641        | 0.1791 | 1600 | 0.5848          | 0.6995   | 0.5927   |
| 0.5884        | 0.2015 | 1800 | 0.5797          | 0.6993   | 0.6093   |
| 0.5814        | 0.2239 | 2000 | 0.5807          | 0.6997   | 0.6100   |
| 0.5875        | 0.2462 | 2200 | 0.5774          | 0.7015   | 0.6151   |
| 0.5627        | 0.2686 | 2400 | 0.5796          | 0.6997   | 0.6302   |
| 0.5521        | 0.2910 | 2600 | 0.5856          | 0.7010   | 0.6140   |
| 0.5979        | 0.3134 | 2800 | 0.5742          | 0.7023   | 0.6094   |
| 0.6046        | 0.3358 | 3000 | 0.5792          | 0.6946   | 0.6408   |
| 0.5741        | 0.3582 | 3200 | 0.5781          | 0.7011   | 0.6301   |
| 0.566         | 0.3805 | 3400 | 0.5752          | 0.7013   | 0.6330   |
| 0.5589        | 0.4029 | 3600 | 0.5769          | 0.7010   | 0.6291   |
| 0.5758        | 0.4253 | 3800 | 0.5733          | 0.7033   | 0.6329   |
| 0.5714        | 0.4477 | 4000 | 0.5718          | 0.7044   | 0.6223   |
| 0.5797        | 0.4701 | 4200 | 0.5764          | 0.7021   | 0.6367   |
| 0.5669        | 0.4925 | 4400 | 0.5726          | 0.7022   | 0.6393   |
| 0.5655        | 0.5149 | 4600 | 0.5764          | 0.7062   | 0.6183   |
| 0.5743        | 0.5372 | 4800 | 0.5720          | 0.7053   | 0.6294   |
| 0.5657        | 0.5596 | 5000 | 0.5704          | 0.7047   | 0.6338   |
| 0.5766        | 0.5820 | 5200 | 0.5723          | 0.7031   | 0.6400   |
| 0.5748        | 0.6044 | 5400 | 0.5699          | 0.7067   | 0.6121   |
| 0.5669        | 0.6268 | 5600 | 0.5720          | 0.7048   | 0.6379   |
| 0.5557        | 0.6492 | 5800 | 0.5670          | 0.7071   | 0.6124   |
| 0.5675        | 0.6716 | 6000 | 0.5680          | 0.7075   | 0.6181   |
| 0.5808        | 0.6939 | 6200 | 0.5700          | 0.7066   | 0.6331   |
| 0.5792        | 0.7163 | 6400 | 0.5736          | 0.7001   | 0.6446   |
| 0.5583        | 0.7387 | 6600 | 0.5687          | 0.7060   | 0.6346   |
| 0.582         | 0.7611 | 6800 | 0.5667          | 0.7076   | 0.6248   |
| 0.5769        | 0.7835 | 7000 | 0.5694          | 0.7051   | 0.6411   |
| 0.568         | 0.8059 | 7200 | 0.5675          | 0.7081   | 0.6286   |
| 0.5712        | 0.8283 | 7400 | 0.5674          | 0.7084   | 0.6249   |
| 0.554         | 0.8506 | 7600 | 0.5675          | 0.7076   | 0.6350   |
| 0.5707        | 0.8730 | 7800 | 0.5661          | 0.7077   | 0.6347   |
| 0.577         | 0.8954 | 8000 | 0.5685          | 0.7066   | 0.6406   |
| 0.5766        | 0.9178 | 8200 | 0.5677          | 0.7077   | 0.6351   |
| 0.5992        | 0.9402 | 8400 | 0.5656          | 0.7084   | 0.6327   |
| 0.5744        | 0.9626 | 8600 | 0.5671          | 0.7061   | 0.6407   |
| 0.5748        | 0.9849 | 8800 | 0.5663          | 0.7078   | 0.6362   |
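
The accuracy and macro-F1 columns above can be reproduced with a standard `compute_metrics` callback passed to `Trainer`; the following scikit-learn sketch is an assumption, not the author's published script:

```python
# Hypothetical compute_metrics callback producing the accuracy and
# macro F1 reported above; the original training script is not published.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "macro_f1": f1_score(labels, preds, average="macro"),
    }
```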

### Framework versions

- Transformers 4.40.2
- PyTorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1