scenario-kd-pre-ner-full-mdeberta_data-univner_full66

This model is a fine-tuned version of microsoft/mdeberta-v3-base. The training dataset is not recorded (the card metadata lists it as None). It achieves the following results on the evaluation set, taken at the final logged step (17000):

Loss: 46.5988
Precision: 0.8195
Recall: 0.8293
F1: 0.8244
Accuracy: 0.9821
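
As a consistency check on the figures above, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported (already rounded) values; the helper name `f1_score` is illustrative and not part of the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results listed above.
reported = {"precision": 0.8195, "recall": 0.8293, "f1": 0.8244}

computed = f1_score(reported["precision"], reported["recall"])
print(round(computed, 4))  # 0.8244, matching the reported F1
```

Because the inputs are rounded to four decimals, a recomputed F1 can in general differ from a reported one in the last digit; here the two agree.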

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 8
eval_batch_size: 32
seed: 66
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
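
The relationship between these values can be sketched in a few lines of plain Python. The effective batch size is the per-device batch size multiplied by the gradient accumulation steps (and by the device count, assumed here to be 1 because the card does not state it). The `linear_lr` helper mirrors the shape of a linear schedule with optional warmup; the card lists no warmup, so zero warmup steps is an assumption, and the `total_steps` value in the example call is hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4

# Assumption: training ran on a single device (not stated in the card).
NUM_DEVICES = 1


def total_train_batch_size(per_device: int, accumulation: int,
                           devices: int = NUM_DEVICES) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device * accumulation * devices


def linear_lr(step: int, total_steps: int, base_lr: float = LEARNING_RATE,
              warmup_steps: int = 0) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))  # 32

# Hypothetical schedule length, used only to illustrate the decay.
print(linear_lr(step=500, total_steps=1000))
```

With these settings each optimizer step sees 8 x 4 = 32 examples, which matches the listed total_train_batch_size.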

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
153.8993	0.2911	500	113.5285	0.5780	0.1984	0.2954	0.9366
102.1183	0.5822	1000	93.7263	0.7193	0.6563	0.6864	0.9703
89.9837	0.8732	1500	86.5151	0.7453	0.7319	0.7385	0.9749
83.6429	1.1643	2000	81.3350	0.7717	0.7491	0.7602	0.9767
78.6507	1.4554	2500	77.3080	0.7857	0.7660	0.7757	0.9781
74.9508	1.7465	3000	73.9533	0.7926	0.7823	0.7874	0.9790
71.7188	2.0375	3500	71.1878	0.7980	0.7804	0.7891	0.9791
68.3575	2.3286	4000	68.3794	0.7947	0.8147	0.8046	0.9800
66.0274	2.6197	4500	66.0870	0.7946	0.8051	0.7998	0.9804
63.9924	2.9108	5000	64.1186	0.7967	0.8124	0.8045	0.9805
61.6666	3.2019	5500	62.2723	0.8050	0.8156	0.8103	0.9809
59.8490	3.4929	6000	60.5987	0.8155	0.8085	0.8120	0.9809
58.2341	3.7840	6500	59.1194	0.8103	0.8197	0.8149	0.9808
56.8294	4.0751	7000	57.7059	0.8040	0.8205	0.8122	0.9810
55.3324	4.3662	7500	56.4590	0.8129	0.8254	0.8191	0.9815
54.1620	4.6573	8000	55.4182	0.8090	0.8171	0.8130	0.9812
53.0463	4.9483	8500	54.4110	0.8093	0.8224	0.8158	0.9810
51.9352	5.2394	9000	53.4110	0.8150	0.8222	0.8186	0.9813
50.9231	5.5305	9500	52.4684	0.8133	0.8246	0.8189	0.9815
50.1170	5.8216	10000	51.6818	0.8171	0.8172	0.8171	0.9815
49.2981	6.1126	10500	50.9482	0.8163	0.8289	0.8225	0.9817
48.5083	6.4037	11000	50.3213	0.8192	0.8264	0.8228	0.9817
47.9777	6.6948	11500	49.7021	0.8190	0.8208	0.8199	0.9820
47.4172	6.9859	12000	49.1544	0.8169	0.8318	0.8243	0.9818
46.7837	7.2770	12500	48.7230	0.8154	0.8272	0.8212	0.9815
46.2461	7.5680	13000	48.2942	0.8146	0.8336	0.8240	0.9819
45.9849	7.8591	13500	47.8808	0.8189	0.8240	0.8214	0.9818
45.5189	8.1502	14000	47.5726	0.8208	0.8250	0.8229	0.9817
45.1838	8.4413	14500	47.2858	0.8187	0.8361	0.8273	0.9818
44.9718	8.7324	15000	47.0646	0.8227	0.8306	0.8266	0.9821
44.7603	9.0234	15500	46.8393	0.8199	0.8285	0.8242	0.9818
44.5092	9.3145	16000	46.7532	0.8221	0.8300	0.8260	0.9820
44.4749	9.6056	16500	46.6502	0.8206	0.8306	0.8256	0.9822
44.3616	9.8967	17000	46.5988	0.8195	0.8293	0.8244	0.9821
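
Two things can be read off the table: the highest F1 (0.8273) occurs at step 14500, while the headline results correspond to the final row at step 17000, and the step and epoch columns imply roughly 1718 optimizer steps per epoch. The sketch below reproduces both observations from the last seven rows; the tuple layout and variable names are illustrative only.

```python
# (step, epoch, validation_loss, f1) copied from the last rows of the table above.
ROWS = [
    (14000, 8.1502, 47.5726, 0.8229),
    (14500, 8.4413, 47.2858, 0.8273),
    (15000, 8.7324, 47.0646, 0.8266),
    (15500, 9.0234, 46.8393, 0.8242),
    (16000, 9.3145, 46.7532, 0.8260),
    (16500, 9.6056, 46.6502, 0.8256),
    (17000, 9.8967, 46.5988, 0.8244),
]

# Checkpoint with the highest F1 versus the one with the lowest validation loss.
best_f1 = max(ROWS, key=lambda row: row[3])
lowest_loss = min(ROWS, key=lambda row: row[2])

# Optimizer steps per epoch, estimated from the final row (step / epoch).
last_step, last_epoch, _, _ = ROWS[-1]
steps_per_epoch = last_step / last_epoch

print(best_f1[0], best_f1[3])
print(lowest_loss[0], lowest_loss[2])
print(round(steps_per_epoch))
```

Validation loss keeps falling through the last step while F1 fluctuates within about 0.003 over these rows, so choosing a checkpoint by F1 rather than by loss would select step 14500 instead of the final one.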

Framework versions

Transformers 4.44.2
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.19.1
