
scenario-KD-PR-MSV-EN-EN-D2_data-en-massive_all_1_144

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-en-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the metrics):

  • Loss: 3.3870
  • Accuracy: 0.3917
  • F1: 0.3631
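
The card does not state the model architecture, task head, or supporting library, so the following is only a minimal sketch for loading the checkpoint, assuming it is a standard Transformers sequence-classification model for MASSIVE intent classification; verify the config before relying on it.

```python
# Hedged loading sketch. Assumption: this checkpoint exposes a standard
# sequence-classification head for MASSIVE intent labels; the card itself
# does not confirm the architecture or library.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "haryoaw/scenario-KD-PR-MSV-EN-EN-D2_data-en-massive_all_1_144"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Single-utterance example in the MASSIVE style.
inputs = tokenizer("wake me up at nine am on friday", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label.get(predicted_id, predicted_id))
```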

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch mirroring them follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 44
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
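
As a rough, unofficial reconstruction, the hyperparameters above map onto Hugging Face TrainingArguments as shown below; output_dir and any setting not listed in the card are placeholders, and the use of the standard Trainer is itself an assumption.

```python
# Hedged sketch: the listed hyperparameters expressed as TrainingArguments.
# Assumes the standard Hugging Face Trainer; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./scenario-KD-PR-MSV-EN-EN-D2",  # placeholder, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=44,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=30,
)
```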

Training results

Training Loss Epoch Step Validation Loss Accuracy F1
No log 0.28 100 3.7985 0.2429 0.1191
No log 0.56 200 3.5797 0.3350 0.2393
No log 0.83 300 3.4614 0.3511 0.2584
No log 1.11 400 3.4068 0.3689 0.2947
2.3847 1.39 500 3.5361 0.3480 0.3101
2.3847 1.67 600 3.8981 0.2962 0.2743
2.3847 1.94 700 3.5978 0.3348 0.3009
2.3847 2.22 800 3.4251 0.3693 0.3189
2.3847 2.5 900 3.6238 0.3387 0.2955
1.4359 2.78 1000 3.4170 0.3725 0.3228
1.4359 3.06 1100 3.4919 0.3577 0.3094
1.4359 3.33 1200 3.5121 0.3529 0.3200
1.4359 3.61 1300 3.5243 0.3552 0.3181
1.4359 3.89 1400 3.5490 0.3579 0.3271
1.2213 4.17 1500 3.7359 0.3382 0.3141
1.2213 4.44 1600 3.4488 0.3750 0.3190
1.2213 4.72 1700 3.8128 0.3207 0.3010
1.2213 5.0 1800 3.6438 0.3436 0.3157
1.2213 5.28 1900 3.6529 0.3533 0.3232
1.085 5.56 2000 3.7020 0.3460 0.3180
1.085 5.83 2100 3.5656 0.3617 0.3212
1.085 6.11 2200 3.7196 0.3451 0.3331
1.085 6.39 2300 3.4895 0.3783 0.3449
1.085 6.67 2400 3.4481 0.3827 0.3461
1.0193 6.94 2500 3.5108 0.3743 0.3371
1.0193 7.22 2600 3.6085 0.3680 0.3401
1.0193 7.5 2700 3.7560 0.3461 0.3396
1.0193 7.78 2800 3.6117 0.3654 0.3430
1.0193 8.06 2900 3.8823 0.3372 0.3342
0.9642 8.33 3000 4.1240 0.2905 0.3077
0.9642 8.61 3100 3.5464 0.3624 0.3257
0.9642 8.89 3200 3.7347 0.3436 0.3277
0.9642 9.17 3300 3.7061 0.3393 0.3172
0.9642 9.44 3400 3.7392 0.3448 0.3316
0.9379 9.72 3500 3.7291 0.3382 0.3217
0.9379 10.0 3600 3.4839 0.3661 0.3376
0.9379 10.28 3700 3.5460 0.3703 0.3383
0.9379 10.56 3800 3.5424 0.3719 0.3402
0.9379 10.83 3900 3.7746 0.3507 0.3373
0.9141 11.11 4000 3.6570 0.3653 0.3369
0.9141 11.39 4100 3.6878 0.3567 0.3366
0.9141 11.67 4200 3.4917 0.3786 0.3503
0.9141 11.94 4300 3.6285 0.3568 0.3375
0.9141 12.22 4400 3.7634 0.3416 0.3232
0.8926 12.5 4500 3.6110 0.3640 0.3335
0.8926 12.78 4600 3.7520 0.3365 0.3206
0.8926 13.06 4700 3.6192 0.3649 0.3343
0.8926 13.33 4800 3.6111 0.3648 0.3258
0.8926 13.61 4900 3.6608 0.3553 0.3316
0.881 13.89 5000 3.6331 0.3596 0.3414
0.881 14.17 5100 3.5635 0.3697 0.3486
0.881 14.44 5200 3.5596 0.3728 0.3476
0.881 14.72 5300 3.4594 0.3890 0.3505
0.881 15.0 5400 3.5156 0.3752 0.3387
0.8711 15.28 5500 3.7477 0.3417 0.3220
0.8711 15.56 5600 3.4787 0.3726 0.3433
0.8711 15.83 5700 3.3340 0.4009 0.3567
0.8711 16.11 5800 3.5768 0.3636 0.3398
0.8711 16.39 5900 3.5530 0.3682 0.3436
0.8624 16.67 6000 3.5606 0.3622 0.3428
0.8624 16.94 6100 3.5734 0.3639 0.3428
0.8624 17.22 6200 3.6723 0.3560 0.3326
0.8624 17.5 6300 3.4305 0.3926 0.3590
0.8624 17.78 6400 3.5705 0.3697 0.3485
0.8568 18.06 6500 3.5787 0.3717 0.3562
0.8568 18.33 6600 3.5437 0.3682 0.3459
0.8568 18.61 6700 3.4142 0.3933 0.3551
0.8568 18.89 6800 3.5347 0.3757 0.3533
0.8568 19.17 6900 3.4827 0.3751 0.3474
0.8485 19.44 7000 3.5962 0.3686 0.3475
0.8485 19.72 7100 3.6892 0.3526 0.3444
0.8485 20.0 7200 3.7340 0.3527 0.3421
0.8485 20.28 7300 3.6498 0.3529 0.3388
0.8485 20.56 7400 3.5198 0.3712 0.3440
0.8454 20.83 7500 3.5547 0.3731 0.3460
0.8454 21.11 7600 3.4824 0.3827 0.3530
0.8454 21.39 7700 3.7520 0.3479 0.3489
0.8454 21.67 7800 3.4160 0.3927 0.3530
0.8454 21.94 7900 3.4024 0.3916 0.3555
0.8442 22.22 8000 3.5260 0.3766 0.3571
0.8442 22.5 8100 3.7724 0.3411 0.3307
0.8442 22.78 8200 3.4421 0.3906 0.3611
0.8442 23.06 8300 3.5752 0.3697 0.3521
0.8442 23.33 8400 3.6166 0.3607 0.3474
0.8387 23.61 8500 3.4849 0.3772 0.3468
0.8387 23.89 8600 3.6369 0.3550 0.3435
0.8387 24.17 8700 3.5332 0.3731 0.3564
0.8387 24.44 8800 3.4314 0.3856 0.3612
0.8387 24.72 8900 3.5849 0.3646 0.3489
0.8373 25.0 9000 3.4793 0.3775 0.3532
0.8373 25.28 9100 3.4012 0.3874 0.3601
0.8373 25.56 9200 3.5138 0.3746 0.3531
0.8373 25.83 9300 3.3756 0.3955 0.3663
0.8373 26.11 9400 3.4281 0.3847 0.3546
0.8357 26.39 9500 3.3819 0.3928 0.3576
0.8357 26.67 9600 3.3574 0.3965 0.3640
0.8357 26.94 9700 3.3550 0.3962 0.3621
0.8357 27.22 9800 3.4785 0.3769 0.3571
0.8357 27.5 9900 3.5116 0.3717 0.3495
0.8341 27.78 10000 3.4470 0.3797 0.3562
0.8341 28.06 10100 3.4118 0.3878 0.3642
0.8341 28.33 10200 3.3945 0.3910 0.3637
0.8341 28.61 10300 3.4078 0.3854 0.3591
0.8341 28.89 10400 3.5367 0.3678 0.3548
0.8325 29.17 10500 3.4340 0.3825 0.3605
0.8325 29.44 10600 3.4028 0.3875 0.3604
0.8325 29.72 10700 3.3913 0.3904 0.3635
0.8325 30.0 10800 3.3870 0.3917 0.3631

Framework versions

  • Transformers 4.33.3
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.13.3
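
To check that a local environment matches these versions, a small sanity-check script (assuming all four packages are installed) could look like:

```python
# Print installed versions to compare against the ones listed above.
import transformers, torch, datasets, tokenizers

print("Transformers:", transformers.__version__)  # card lists 4.33.3
print("PyTorch:", torch.__version__)              # card lists 2.1.1+cu121
print("Datasets:", datasets.__version__)          # card lists 2.14.5
print("Tokenizers:", tokenizers.__version__)      # card lists 0.13.3
```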