metadata
license: mit
base_model: microsoft/deberta-v3-xsmall
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: recommendation-news-clicked-random-select-and-filter
results: []
recommendation-news-clicked-random-select-and-filter
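This model is a fine-tuned version of microsoft/deberta-v3-xsmall on an unknown dataset. It achieves the results below on the evaluation set. These figures match the step-6400 row of the training results table, which has the highest Macro F1 of any logged evaluation, rather than the final logged step (8800):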
- Loss: 0.5736
- Accuracy: 0.7001
- Macro F1: 0.6446
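Accuracy and Macro F1 can diverge sharply when the labels are imbalanced. The sketch below computes both in plain Python, assuming a binary task (clicked vs. not clicked is inferred from the model name; the card does not state the label set). The helper names and the synthetic 6658/3342 label split are illustrative assumptions, not taken from the training code. With that split, a classifier that always predicts the majority class reproduces the accuracy 0.6658 and Macro F1 0.3997 logged at steps 200 and 400 in the training results table, which is consistent with the model not yet separating the classes at that point.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true, y_pred, cls):
    """F1 of one class treated as the positive label; 0.0 if undefined."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical data (assumption): 66.58% of examples belong to class 0.
y_true = [0] * 6658 + [1] * 3342
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 10000

baseline_accuracy = accuracy(y_true, y_pred)
baseline_macro_f1 = macro_f1(y_true, y_pred)
print(round(baseline_accuracy, 4), round(baseline_macro_f1, 4))
```

Because Macro F1 weights both classes equally, it penalizes ignoring the minority class even when accuracy looks acceptable.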
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4.5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
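As a rough illustration of how these settings relate, the sketch below derives the total train batch size and a linear learning-rate decay in plain Python. Three things are assumptions rather than facts from this card: a single training device, zero warmup steps (none are reported), and a total of about 8935 optimizer steps, estimated from the step and epoch columns of the training results table. The function `linear_lr` is a hypothetical helper that mirrors the usual linear schedule shape; it is not the training code.

```python
# Values copied from the hyperparameter list above.
learning_rate = 4.5e-05
train_batch_size = 32
gradient_accumulation_steps = 4

# Assumption: one device. The card does not report the device count.
num_devices = 1

# Examples consumed per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr=learning_rate, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: estimated from the logged step/epoch pairs, not reported directly.
total_steps = 8935

lr_start = linear_lr(0, total_steps)
lr_end = linear_lr(total_steps, total_steps)
print(total_train_batch_size, lr_start, lr_end)
```

Under these assumptions the learning rate starts at 4.5e-05 and reaches zero at the end of the single epoch.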
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Macro F1 |
|---|---|---|---|---|---|
| 0.6446 | 0.0224 | 200 | 0.6369 | 0.6658 | 0.3997 |
| 0.65 | 0.0448 | 400 | 0.6337 | 0.6658 | 0.3997 |
| 0.6092 | 0.0672 | 600 | 0.6087 | 0.6725 | 0.5938 |
| 0.5988 | 0.0895 | 800 | 0.5995 | 0.6872 | 0.5664 |
| 0.5839 | 0.1119 | 1000 | 0.5979 | 0.6937 | 0.5908 |
| 0.6082 | 0.1343 | 1200 | 0.5879 | 0.6936 | 0.6149 |
| 0.5912 | 0.1567 | 1400 | 0.5857 | 0.6946 | 0.5626 |
| 0.5641 | 0.1791 | 1600 | 0.5848 | 0.6995 | 0.5927 |
| 0.5884 | 0.2015 | 1800 | 0.5797 | 0.6993 | 0.6093 |
| 0.5814 | 0.2239 | 2000 | 0.5807 | 0.6997 | 0.6100 |
| 0.5875 | 0.2462 | 2200 | 0.5774 | 0.7015 | 0.6151 |
| 0.5627 | 0.2686 | 2400 | 0.5796 | 0.6997 | 0.6302 |
| 0.5521 | 0.2910 | 2600 | 0.5856 | 0.7010 | 0.6140 |
| 0.5979 | 0.3134 | 2800 | 0.5742 | 0.7023 | 0.6094 |
| 0.6046 | 0.3358 | 3000 | 0.5792 | 0.6946 | 0.6408 |
| 0.5741 | 0.3582 | 3200 | 0.5781 | 0.7011 | 0.6301 |
| 0.566 | 0.3805 | 3400 | 0.5752 | 0.7013 | 0.6330 |
| 0.5589 | 0.4029 | 3600 | 0.5769 | 0.7010 | 0.6291 |
| 0.5758 | 0.4253 | 3800 | 0.5733 | 0.7033 | 0.6329 |
| 0.5714 | 0.4477 | 4000 | 0.5718 | 0.7044 | 0.6223 |
| 0.5797 | 0.4701 | 4200 | 0.5764 | 0.7021 | 0.6367 |
| 0.5669 | 0.4925 | 4400 | 0.5726 | 0.7022 | 0.6393 |
| 0.5655 | 0.5149 | 4600 | 0.5764 | 0.7062 | 0.6183 |
| 0.5743 | 0.5372 | 4800 | 0.5720 | 0.7053 | 0.6294 |
| 0.5657 | 0.5596 | 5000 | 0.5704 | 0.7047 | 0.6338 |
| 0.5766 | 0.5820 | 5200 | 0.5723 | 0.7031 | 0.6400 |
| 0.5748 | 0.6044 | 5400 | 0.5699 | 0.7067 | 0.6121 |
| 0.5669 | 0.6268 | 5600 | 0.5720 | 0.7048 | 0.6379 |
| 0.5557 | 0.6492 | 5800 | 0.5670 | 0.7071 | 0.6124 |
| 0.5675 | 0.6716 | 6000 | 0.5680 | 0.7075 | 0.6181 |
| 0.5808 | 0.6939 | 6200 | 0.5700 | 0.7066 | 0.6331 |
| 0.5792 | 0.7163 | 6400 | 0.5736 | 0.7001 | 0.6446 |
| 0.5583 | 0.7387 | 6600 | 0.5687 | 0.7060 | 0.6346 |
| 0.582 | 0.7611 | 6800 | 0.5667 | 0.7076 | 0.6248 |
| 0.5769 | 0.7835 | 7000 | 0.5694 | 0.7051 | 0.6411 |
| 0.568 | 0.8059 | 7200 | 0.5675 | 0.7081 | 0.6286 |
| 0.5712 | 0.8283 | 7400 | 0.5674 | 0.7084 | 0.6249 |
| 0.554 | 0.8506 | 7600 | 0.5675 | 0.7076 | 0.6350 |
| 0.5707 | 0.8730 | 7800 | 0.5661 | 0.7077 | 0.6347 |
| 0.577 | 0.8954 | 8000 | 0.5685 | 0.7066 | 0.6406 |
| 0.5766 | 0.9178 | 8200 | 0.5677 | 0.7077 | 0.6351 |
| 0.5992 | 0.9402 | 8400 | 0.5656 | 0.7084 | 0.6327 |
| 0.5744 | 0.9626 | 8600 | 0.5671 | 0.7061 | 0.6407 |
| 0.5748 | 0.9849 | 8800 | 0.5663 | 0.7078 | 0.6362 |
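The table can be read in more than one way depending on which metric defines the "best" checkpoint. The sketch below uses an excerpt of rows copied from the table and picks the best step under each metric; across the full table the Macro F1 and validation loss winners are the same as in this excerpt. It also estimates the number of optimizer steps per epoch from the last logged row, and from that an approximate training set size, which assumes the total train batch size of 128 applies to every step. The card does not say how the reported checkpoint was chosen; Trainer options such as `load_best_model_at_end` with `metric_for_best_model` could produce this pattern, but that is an assumption.

```python
# Excerpt of (step, validation_loss, accuracy, macro_f1) rows from the table above.
rows = [
    (3000, 0.5792, 0.6946, 0.6408),
    (6400, 0.5736, 0.7001, 0.6446),
    (7000, 0.5694, 0.7051, 0.6411),
    (7400, 0.5674, 0.7084, 0.6249),
    (8400, 0.5656, 0.7084, 0.6327),
    (8800, 0.5663, 0.7078, 0.6362),
]

# Best row under each selection criterion.
best_by_macro_f1 = max(rows, key=lambda r: r[3])
best_by_val_loss = min(rows, key=lambda r: r[1])
# Steps 7400 and 8400 tie on accuracy; max() returns the first one it sees.
best_by_accuracy = max(rows, key=lambda r: r[2])

# Estimate optimizer steps per epoch from the last logged (step, epoch) pair.
last_step, last_epoch = 8800, 0.9849
steps_per_epoch = round(last_step / last_epoch)

# Assumption: every optimizer step consumes 128 examples.
approx_train_examples = steps_per_epoch * 128

print(best_by_macro_f1[0], best_by_val_loss[0], best_by_accuracy[0])
print(steps_per_epoch, approx_train_examples)
```

Selecting by Macro F1 points to step 6400, the row whose metrics are reported at the top of this card, while validation loss and accuracy both favor later steps.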
Framework versions
- Transformers 4.40.2
- PyTorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1