scenario-kd-pre-ner-full-mdeberta_data-univner_full66
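This model is a fine-tuned version of microsoft/mdeberta-v3-base. The card metadata records the training dataset as None; the model name includes `data-univner_full`, but the dataset itself is not documented here. It achieves the following results on the evaluation set (these match the final logged evaluation at step 17000):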
- Loss: 46.5988
- Precision: 0.8195
- Recall: 0.8293
- F1: 0.8244
- Accuracy: 0.9821
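As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only: the helper name `f1_score` is not part of the original training code, and because the inputs are already rounded to four decimals, agreement is expected only up to rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results listed above.
reported_precision = 0.8195
reported_recall = 0.8293
reported_f1 = 0.8244

computed_f1 = f1_score(reported_precision, reported_recall)
print(f"computed F1: {computed_f1:.4f}, reported F1: {reported_f1:.4f}")
```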
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 8
- eval_batch_size: 32
- seed: 66
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
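Two of the values above follow from the others, and a short sketch may make that concrete. The total train batch size is the per-device batch size times the gradient accumulation steps (times the device count, assumed here to be 1, which the card does not state). The linear scheduler is sketched under the assumption of zero warmup steps, since no warmup is listed. `ASSUMED_TOTAL_STEPS` is a rough estimate inferred from the step/epoch ratio in the training log, not a value reported by the card, and `linear_lr` is a hypothetical helper, not the trainer's own code.

```python
train_batch_size = 8
gradient_accumulation_steps = 4
num_devices = 1  # assumption: single device, not stated in the card

# Number of examples contributing to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

BASE_LR = 3e-05
ASSUMED_TOTAL_STEPS = 17_180  # assumption: about 10 epochs at ~1718 steps/epoch


def linear_lr(step: int, total_steps: int, base_lr: float = BASE_LR,
              warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", effective_batch_size)
for step in (0, ASSUMED_TOTAL_STEPS // 2, ASSUMED_TOTAL_STEPS):
    print(step, linear_lr(step, ASSUMED_TOTAL_STEPS))
```

Under these assumptions the learning rate starts at 3e-05, is halved midway through training, and reaches zero at the final step.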
Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
153.8993 | 0.2911 | 500 | 113.5285 | 0.5780 | 0.1984 | 0.2954 | 0.9366 |
102.1183 | 0.5822 | 1000 | 93.7263 | 0.7193 | 0.6563 | 0.6864 | 0.9703 |
89.9837 | 0.8732 | 1500 | 86.5151 | 0.7453 | 0.7319 | 0.7385 | 0.9749 |
83.6429 | 1.1643 | 2000 | 81.3350 | 0.7717 | 0.7491 | 0.7602 | 0.9767 |
78.6507 | 1.4554 | 2500 | 77.3080 | 0.7857 | 0.7660 | 0.7757 | 0.9781 |
74.9508 | 1.7465 | 3000 | 73.9533 | 0.7926 | 0.7823 | 0.7874 | 0.9790 |
71.7188 | 2.0375 | 3500 | 71.1878 | 0.7980 | 0.7804 | 0.7891 | 0.9791 |
68.3575 | 2.3286 | 4000 | 68.3794 | 0.7947 | 0.8147 | 0.8046 | 0.9800 |
66.0274 | 2.6197 | 4500 | 66.0870 | 0.7946 | 0.8051 | 0.7998 | 0.9804 |
63.9924 | 2.9108 | 5000 | 64.1186 | 0.7967 | 0.8124 | 0.8045 | 0.9805 |
61.6666 | 3.2019 | 5500 | 62.2723 | 0.8050 | 0.8156 | 0.8103 | 0.9809 |
59.8490 | 3.4929 | 6000 | 60.5987 | 0.8155 | 0.8085 | 0.8120 | 0.9809 |
58.2341 | 3.7840 | 6500 | 59.1194 | 0.8103 | 0.8197 | 0.8149 | 0.9808 |
56.8294 | 4.0751 | 7000 | 57.7059 | 0.8040 | 0.8205 | 0.8122 | 0.9810 |
55.3324 | 4.3662 | 7500 | 56.4590 | 0.8129 | 0.8254 | 0.8191 | 0.9815 |
54.1620 | 4.6573 | 8000 | 55.4182 | 0.8090 | 0.8171 | 0.8130 | 0.9812 |
53.0463 | 4.9483 | 8500 | 54.4110 | 0.8093 | 0.8224 | 0.8158 | 0.9810 |
51.9352 | 5.2394 | 9000 | 53.4110 | 0.8150 | 0.8222 | 0.8186 | 0.9813 |
50.9231 | 5.5305 | 9500 | 52.4684 | 0.8133 | 0.8246 | 0.8189 | 0.9815 |
50.1170 | 5.8216 | 10000 | 51.6818 | 0.8171 | 0.8172 | 0.8171 | 0.9815 |
49.2981 | 6.1126 | 10500 | 50.9482 | 0.8163 | 0.8289 | 0.8225 | 0.9817 |
48.5083 | 6.4037 | 11000 | 50.3213 | 0.8192 | 0.8264 | 0.8228 | 0.9817 |
47.9777 | 6.6948 | 11500 | 49.7021 | 0.8190 | 0.8208 | 0.8199 | 0.9820 |
47.4172 | 6.9859 | 12000 | 49.1544 | 0.8169 | 0.8318 | 0.8243 | 0.9818 |
46.7837 | 7.2770 | 12500 | 48.7230 | 0.8154 | 0.8272 | 0.8212 | 0.9815 |
46.2461 | 7.5680 | 13000 | 48.2942 | 0.8146 | 0.8336 | 0.8240 | 0.9819 |
45.9849 | 7.8591 | 13500 | 47.8808 | 0.8189 | 0.8240 | 0.8214 | 0.9818 |
45.5189 | 8.1502 | 14000 | 47.5726 | 0.8208 | 0.8250 | 0.8229 | 0.9817 |
45.1838 | 8.4413 | 14500 | 47.2858 | 0.8187 | 0.8361 | 0.8273 | 0.9818 |
44.9718 | 8.7324 | 15000 | 47.0646 | 0.8227 | 0.8306 | 0.8266 | 0.9821 |
44.7603 | 9.0234 | 15500 | 46.8393 | 0.8199 | 0.8285 | 0.8242 | 0.9818 |
44.5092 | 9.3145 | 16000 | 46.7532 | 0.8221 | 0.8300 | 0.8260 | 0.9820 |
44.4749 | 9.6056 | 16500 | 46.6502 | 0.8206 | 0.8306 | 0.8256 | 0.9822 |
44.3616 | 9.8967 | 17000 | 46.5988 | 0.8195 | 0.8293 | 0.8244 | 0.9821 |
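The log above can be summarized with a few lines of code. The sketch below copies the F1 column from the table (one value per evaluation, every 500 steps), finds the evaluation with the highest F1, and estimates the number of optimizer steps per epoch from the last row. The variable names are illustrative and not taken from the training code.

```python
EVAL_INTERVAL = 500  # evaluations are logged every 500 steps

# F1 column copied from the training results table, in step order.
f1_by_eval = [
    0.2954, 0.6864, 0.7385, 0.7602, 0.7757, 0.7874, 0.7891, 0.8046,
    0.7998, 0.8045, 0.8103, 0.8120, 0.8149, 0.8122, 0.8191, 0.8130,
    0.8158, 0.8186, 0.8189, 0.8171, 0.8225, 0.8228, 0.8199, 0.8243,
    0.8212, 0.8240, 0.8214, 0.8229, 0.8273, 0.8266, 0.8242, 0.8260,
    0.8256, 0.8244,
]

# Index of the evaluation with the highest F1, converted to a step count.
best_index = max(range(len(f1_by_eval)), key=f1_by_eval.__getitem__)
best_step = (best_index + 1) * EVAL_INTERVAL
best_f1 = f1_by_eval[best_index]

final_step = len(f1_by_eval) * EVAL_INTERVAL
final_f1 = f1_by_eval[-1]

# Last row of the table: step 17000 corresponds to epoch 9.8967.
steps_per_epoch = 17000 / 9.8967

print(f"best F1 {best_f1:.4f} at step {best_step}")
print(f"final F1 {final_f1:.4f} at step {final_step}")
print(f"approx. optimizer steps per epoch: {steps_per_epoch:.0f}")
```

The highest logged F1 is slightly above the final one, and the headline metrics at the top of the card match the final row rather than the best one.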
Framework versions
- Transformers 4.44.2
- Pytorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.19.1
Model tree for haryoaw/scenario-kd-pre-ner-full-mdeberta_data-univner_full66
Base model
microsoft/mdeberta-v3-base