scenario-kd-pre-ner-half-mdeberta_data-univner_full66

This model is a fine-tuned version of microsoft/mdeberta-v3-base. The training dataset was not recorded in the card metadata (it is listed as None). At the final logged step (17000), it achieves the following results on the evaluation set:

Loss: 72.6935
Precision: 0.6796
Recall: 0.6181
F1: 0.6474
Accuracy: 0.9659
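
As a quick consistency check, the reported F1 can be recomputed from the precision and recall above, since F1 is their harmonic mean. This is an illustrative sketch, not part of the original training code; the helper name `f1_score` is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported on this card for the evaluation set.
final_f1 = f1_score(0.6796, 0.6181)
print(round(final_f1, 4))
```

Rounded to four decimals, the result agrees with the F1 reported above.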

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 8
eval_batch_size: 32
seed: 66
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
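
The relationship between these values can be sketched in plain Python. The sketch assumes a single training device (so the total batch size is the per-device batch size times the accumulation steps) and zero warmup steps, since the card lists no warmup setting. The function `linear_lr` is a hypothetical helper that mirrors the usual linear-decay rule; it is not taken from the training script.

```python
# Values copied from the hyperparameter list on this card.
learning_rate = 3e-05
train_batch_size = 8
gradient_accumulation_steps = 4

# Assumption: one device, so no extra multiplication by a device count.
effective_batch_size = train_batch_size * gradient_accumulation_steps


def linear_lr(step: int, total_steps: int, base_lr: float = learning_rate,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps defaults to 0 because the card does not list a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size)  # matches total_train_batch_size on the card
```

With these assumptions, the learning rate starts at 3e-05 and falls linearly to zero by the last optimizer step.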

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
174.1041	0.29	500	134.2922	0.0	0.0	0.0	0.9241
125.5123	0.58	1000	119.1368	0.0	0.0	0.0	0.9241
112.735	0.87	1500	108.9108	0.0290	0.0003	0.0006	0.9243
103.9781	1.16	2000	100.5534	0.2224	0.0876	0.1257	0.9312
96.6859	1.46	2500	95.6493	0.2821	0.1262	0.1744	0.9343
92.331	1.75	3000	90.4952	0.4293	0.2782	0.3376	0.9430
88.0821	2.04	3500	87.3770	0.5074	0.3334	0.4024	0.9472
84.7183	2.33	4000	84.8580	0.5047	0.4222	0.4597	0.9513
82.5316	2.62	4500	83.2307	0.5728	0.3926	0.4659	0.9517
80.8193	2.91	5000	81.3109	0.5713	0.4522	0.5048	0.9550
78.7442	3.2	5500	80.1857	0.5653	0.5155	0.5392	0.9577
77.6656	3.49	6000	79.3523	0.5833	0.5346	0.5579	0.9588
76.4846	3.78	6500	78.3908	0.6190	0.5148	0.5621	0.9587
75.9335	4.08	7000	77.5087	0.6213	0.5347	0.5748	0.9602
74.7006	4.37	7500	77.0236	0.6098	0.5647	0.5864	0.9607
74.0752	4.66	8000	76.5373	0.6282	0.5696	0.5975	0.9615
73.8701	4.95	8500	75.8483	0.6267	0.5887	0.6071	0.9629
73.1911	5.24	9000	75.4438	0.6511	0.5876	0.6177	0.9637
72.5251	5.53	9500	75.0009	0.6443	0.5826	0.6119	0.9632
71.9142	5.82	10000	74.7027	0.6427	0.6166	0.6294	0.9642
71.6262	6.11	10500	74.3620	0.6683	0.5985	0.6315	0.9642
71.2638	6.4	11000	74.2337	0.6808	0.5713	0.6213	0.9640
71.1012	6.69	11500	73.8078	0.6718	0.6166	0.6430	0.9651
70.8483	6.99	12000	73.6011	0.6728	0.6014	0.6351	0.9651
70.5965	7.28	12500	73.5710	0.6875	0.5869	0.6333	0.9649
70.2581	7.57	13000	73.3481	0.6653	0.6270	0.6456	0.9663
70.0625	7.86	13500	73.2402	0.6830	0.5982	0.6378	0.9652
69.9168	8.15	14000	73.1845	0.6806	0.6116	0.6443	0.9655
69.7048	8.44	14500	72.9083	0.6836	0.6156	0.6478	0.9657
69.5969	8.73	15000	72.7066	0.6783	0.6234	0.6497	0.9661
69.3262	9.02	15500	72.7057	0.6733	0.6218	0.6466	0.9658
69.3689	9.31	16000	72.6039	0.6847	0.6198	0.6507	0.9663
69.3319	9.61	16500	72.6736	0.6825	0.6247	0.6524	0.9661
69.2166	9.9	17000	72.6935	0.6796	0.6181	0.6474	0.9659
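
The final logged row is not the best one by every metric. The sketch below copies only the last five rows of the table above and picks out the row with the highest F1 and the row with the lowest validation loss. The variable names are illustrative, and the card does not state which checkpoint was actually saved.

```python
# (step, validation_loss, f1) for the last five rows of the table above.
rows = [
    (15000, 72.7066, 0.6497),
    (15500, 72.7057, 0.6466),
    (16000, 72.6039, 0.6507),
    (16500, 72.6736, 0.6524),
    (17000, 72.6935, 0.6474),
]

# Row with the highest F1 and row with the lowest validation loss.
best_by_f1 = max(rows, key=lambda row: row[2])
best_by_loss = min(rows, key=lambda row: row[1])

print("best F1:", best_by_f1)
print("lowest validation loss:", best_by_loss)
```

Among these rows, F1 peaks at step 16500 and validation loss is lowest at step 16000, while the headline results come from step 17000.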

Framework versions

Transformers 4.33.3
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
