scenario-kd-pre-ner-full-mdeberta_data-univner_full66

This model is a fine-tuned version of microsoft/mdeberta-v3-base. The training dataset is not recorded in the card metadata (it is listed as None). It achieves the following results on the evaluation set:

  • Loss: 46.5988
  • Precision: 0.8195
  • Recall: 0.8293
  • F1: 0.8244
  • Accuracy: 0.9821
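
As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall as their harmonic mean. The sketch below assumes the card uses the standard F1 definition; the function name `f1_score` is introduced here for illustration only, and because the inputs are already rounded to four decimals the result is approximate.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Final evaluation values copied from the results list above.
reported_precision = 0.8195
reported_recall = 0.8293

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 4))  # 0.8244, matching the reported F1
```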

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 8
  • eval_batch_size: 32
  • seed: 66
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
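
The sketch below illustrates how two of these settings fit together: the total train batch size is the per-device batch size multiplied by the gradient accumulation steps, and a linear scheduler decays the learning rate from its base value to zero. It is only an illustration under stated assumptions: a single device, zero warmup steps (the card lists none), and a hypothetical total step count. The helper names `effective_batch_size` and `linear_lr` are not taken from the card.

```python
TRAIN_BATCH_SIZE = 8        # train_batch_size from the card
GRAD_ACCUM_STEPS = 4        # gradient_accumulation_steps from the card
BASE_LR = 3e-05             # learning_rate from the card
ASSUMED_TOTAL_STEPS = 17_170  # assumption: not stated in the card


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))  # 32
# Learning rate halfway through the assumed schedule.
print(linear_lr(ASSUMED_TOTAL_STEPS // 2, ASSUMED_TOTAL_STEPS, BASE_LR))
```

With one device, 8 × 4 = 32 agrees with the listed total_train_batch_size.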

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|---|
| 153.8993 | 0.2911 | 500 | 113.5285 | 0.5780 | 0.1984 | 0.2954 | 0.9366 |
| 102.1183 | 0.5822 | 1000 | 93.7263 | 0.7193 | 0.6563 | 0.6864 | 0.9703 |
| 89.9837 | 0.8732 | 1500 | 86.5151 | 0.7453 | 0.7319 | 0.7385 | 0.9749 |
| 83.6429 | 1.1643 | 2000 | 81.3350 | 0.7717 | 0.7491 | 0.7602 | 0.9767 |
| 78.6507 | 1.4554 | 2500 | 77.3080 | 0.7857 | 0.7660 | 0.7757 | 0.9781 |
| 74.9508 | 1.7465 | 3000 | 73.9533 | 0.7926 | 0.7823 | 0.7874 | 0.9790 |
| 71.7188 | 2.0375 | 3500 | 71.1878 | 0.7980 | 0.7804 | 0.7891 | 0.9791 |
| 68.3575 | 2.3286 | 4000 | 68.3794 | 0.7947 | 0.8147 | 0.8046 | 0.9800 |
| 66.0274 | 2.6197 | 4500 | 66.0870 | 0.7946 | 0.8051 | 0.7998 | 0.9804 |
| 63.9924 | 2.9108 | 5000 | 64.1186 | 0.7967 | 0.8124 | 0.8045 | 0.9805 |
| 61.6666 | 3.2019 | 5500 | 62.2723 | 0.8050 | 0.8156 | 0.8103 | 0.9809 |
| 59.8490 | 3.4929 | 6000 | 60.5987 | 0.8155 | 0.8085 | 0.8120 | 0.9809 |
| 58.2341 | 3.7840 | 6500 | 59.1194 | 0.8103 | 0.8197 | 0.8149 | 0.9808 |
| 56.8294 | 4.0751 | 7000 | 57.7059 | 0.8040 | 0.8205 | 0.8122 | 0.9810 |
| 55.3324 | 4.3662 | 7500 | 56.4590 | 0.8129 | 0.8254 | 0.8191 | 0.9815 |
| 54.1620 | 4.6573 | 8000 | 55.4182 | 0.8090 | 0.8171 | 0.8130 | 0.9812 |
| 53.0463 | 4.9483 | 8500 | 54.4110 | 0.8093 | 0.8224 | 0.8158 | 0.9810 |
| 51.9352 | 5.2394 | 9000 | 53.4110 | 0.8150 | 0.8222 | 0.8186 | 0.9813 |
| 50.9231 | 5.5305 | 9500 | 52.4684 | 0.8133 | 0.8246 | 0.8189 | 0.9815 |
| 50.1170 | 5.8216 | 10000 | 51.6818 | 0.8171 | 0.8172 | 0.8171 | 0.9815 |
| 49.2981 | 6.1126 | 10500 | 50.9482 | 0.8163 | 0.8289 | 0.8225 | 0.9817 |
| 48.5083 | 6.4037 | 11000 | 50.3213 | 0.8192 | 0.8264 | 0.8228 | 0.9817 |
| 47.9777 | 6.6948 | 11500 | 49.7021 | 0.8190 | 0.8208 | 0.8199 | 0.9820 |
| 47.4172 | 6.9859 | 12000 | 49.1544 | 0.8169 | 0.8318 | 0.8243 | 0.9818 |
| 46.7837 | 7.2770 | 12500 | 48.7230 | 0.8154 | 0.8272 | 0.8212 | 0.9815 |
| 46.2461 | 7.5680 | 13000 | 48.2942 | 0.8146 | 0.8336 | 0.8240 | 0.9819 |
| 45.9849 | 7.8591 | 13500 | 47.8808 | 0.8189 | 0.8240 | 0.8214 | 0.9818 |
| 45.5189 | 8.1502 | 14000 | 47.5726 | 0.8208 | 0.8250 | 0.8229 | 0.9817 |
| 45.1838 | 8.4413 | 14500 | 47.2858 | 0.8187 | 0.8361 | 0.8273 | 0.9818 |
| 44.9718 | 8.7324 | 15000 | 47.0646 | 0.8227 | 0.8306 | 0.8266 | 0.9821 |
| 44.7603 | 9.0234 | 15500 | 46.8393 | 0.8199 | 0.8285 | 0.8242 | 0.9818 |
| 44.5092 | 9.3145 | 16000 | 46.7532 | 0.8221 | 0.8300 | 0.8260 | 0.9820 |
| 44.4749 | 9.6056 | 16500 | 46.6502 | 0.8206 | 0.8306 | 0.8256 | 0.9822 |
| 44.3616 | 9.8967 | 17000 | 46.5988 | 0.8195 | 0.8293 | 0.8244 | 0.9821 |
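
The logged evaluations can be scanned to see which checkpoint scored the highest F1. The sketch below copies the F1 column from the table above and assumes evaluations were logged every 500 steps, as the Step column indicates; the helper `best_checkpoint` is a name introduced here for illustration. Whether the published weights correspond to the final step or to an earlier checkpoint is not stated in the card.

```python
# F1 values copied from the training results, one per evaluation (steps 500 to 17000).
F1_BY_EVAL = [
    0.2954, 0.6864, 0.7385, 0.7602, 0.7757, 0.7874, 0.7891, 0.8046,
    0.7998, 0.8045, 0.8103, 0.8120, 0.8149, 0.8122, 0.8191, 0.8130,
    0.8158, 0.8186, 0.8189, 0.8171, 0.8225, 0.8228, 0.8199, 0.8243,
    0.8212, 0.8240, 0.8214, 0.8229, 0.8273, 0.8266, 0.8242, 0.8260,
    0.8256, 0.8244,
]
EVAL_INTERVAL = 500  # steps between evaluations, per the Step column


def best_checkpoint(f1_values, interval):
    """Return (step, f1) of the evaluation with the highest F1."""
    best_index = max(range(len(f1_values)), key=lambda i: f1_values[i])
    return (best_index + 1) * interval, f1_values[best_index]


best_step, best_f1 = best_checkpoint(F1_BY_EVAL, EVAL_INTERVAL)
final_f1 = F1_BY_EVAL[-1]

print(f"best F1 {best_f1:.4f} at step {best_step}")
print(f"final F1 {final_f1:.4f} at step {len(F1_BY_EVAL) * EVAL_INTERVAL}")
```

In these logs the highest F1 (0.8273) occurs at step 14500, slightly above the final value of 0.8244 at step 17000.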

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.19.1

Safetensors

  • Model size: 236M params
  • Tensor type: F32

Model tree for haryoaw/scenario-kd-pre-ner-full-mdeberta_data-univner_full66

Finetuned (204): this model