scenario-KD-PR-MSV-EN-EN-D2_data-en-massive_all_1_155

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-en-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set at the final checkpoint (epoch 30, step 10800):

Loss: 3.3196
Accuracy: 0.4000
F1: 0.3699

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 55
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
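
As a rough illustration of the settings above, the sketch below collects them in a dictionary and computes the learning rate a linear schedule would produce at a given step. The dictionary keys are the transformers TrainingArguments parameter names these values most plausibly correspond to, which is an assumption, as is the single-device setup. The helper linear_lr is hypothetical, and it assumes no warmup because the card lists none. The total of 10800 steps is taken from the last row of the training results.

```python
# Hyperparameters from the list above, keyed by the transformers
# TrainingArguments parameter names they most likely map to (assumption).
HPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,  # assumes training on a single device
    "per_device_eval_batch_size": 32,
    "seed": 55,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 30,
}

TOTAL_STEPS = 10800  # last step reported in the training results


def linear_lr(step, base_lr=HPARAMS["learning_rate"], total_steps=TOTAL_STEPS):
    """Learning rate under linear decay to zero, assuming no warmup (hypothetical helper)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Print the scheduled learning rate at the start, midpoint, and end of training.
for step in (0, 5400, 10800):
    print(step, linear_lr(step))
```

If transformers is installed, a dictionary like this could be unpacked into TrainingArguments together with an output_dir of your choosing. The card does not show the original training script, so this should be read as a reconstruction rather than the author's configuration.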

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
No log	0.28	100	3.7488	0.2740	0.1332
No log	0.56	200	3.7532	0.2882	0.2126
No log	0.83	300	3.5330	0.3461	0.2709
No log	1.11	400	3.5996	0.3329	0.2751
2.3142	1.39	500	3.5192	0.3493	0.2852
2.3142	1.67	600	3.4897	0.3511	0.2900
2.3142	1.94	700	3.4726	0.3593	0.3203
2.3142	2.22	800	3.5140	0.3578	0.3084
2.3142	2.5	900	3.5512	0.3474	0.3007
1.3972	2.78	1000	3.4956	0.3584	0.3205
1.3972	3.06	1100	3.4138	0.3751	0.3204
1.3972	3.33	1200	3.5342	0.3637	0.3218
1.3972	3.61	1300	3.4981	0.3607	0.3301
1.3972	3.89	1400	3.3664	0.3832	0.3389
1.1915	4.17	1500	3.4706	0.3685	0.3358
1.1915	4.44	1600	3.5094	0.3776	0.3451
1.1915	4.72	1700	3.5614	0.3575	0.3247
1.1915	5.0	1800	3.4497	0.3779	0.3280
1.1915	5.28	1900	3.5372	0.3560	0.3182
1.0674	5.56	2000	3.6683	0.3411	0.3208
1.0674	5.83	2100	3.5785	0.3517	0.3191
1.0674	6.11	2200	3.4856	0.3787	0.3421
1.0674	6.39	2300	3.6501	0.3562	0.3282
1.0674	6.67	2400	3.6527	0.3599	0.3446
1.0031	6.94	2500	3.5173	0.3712	0.3335
1.0031	7.22	2600	3.4004	0.3906	0.3434
1.0031	7.5	2700	3.3956	0.3882	0.3438
1.0031	7.78	2800	3.4553	0.3757	0.3337
1.0031	8.06	2900	3.5141	0.3785	0.3372
0.9544	8.33	3000	3.4607	0.3745	0.3343
0.9544	8.61	3100	3.5721	0.3698	0.3362
0.9544	8.89	3200	3.4986	0.3748	0.3461
0.9544	9.17	3300	3.5570	0.3638	0.3288
0.9544	9.44	3400	3.4755	0.3801	0.3485
0.9298	9.72	3500	3.5956	0.3633	0.3296
0.9298	10.0	3600	3.7990	0.3346	0.3274
0.9298	10.28	3700	3.4749	0.3801	0.3315
0.9298	10.56	3800	3.5354	0.3668	0.3312
0.9298	10.83	3900	3.5521	0.3653	0.3335
0.9048	11.11	4000	3.5742	0.3695	0.3573
0.9048	11.39	4100	3.6353	0.3566	0.3437
0.9048	11.67	4200	3.5652	0.3707	0.3462
0.9048	11.94	4300	3.5651	0.3657	0.3350
0.9048	12.22	4400	3.4828	0.3792	0.3402
0.8875	12.5	4500	3.4154	0.3903	0.3518
0.8875	12.78	4600	3.5579	0.3669	0.3446
0.8875	13.06	4700	3.5480	0.3678	0.3399
0.8875	13.33	4800	3.7011	0.3535	0.3374
0.8875	13.61	4900	3.5428	0.3682	0.3547
0.8728	13.89	5000	3.5717	0.3697	0.3478
0.8728	14.17	5100	3.5094	0.3767	0.3472
0.8728	14.44	5200	3.5012	0.3688	0.3455
0.8728	14.72	5300	3.5059	0.3699	0.3451
0.8728	15.0	5400	3.4948	0.3834	0.3514
0.864	15.28	5500	3.4681	0.3805	0.3496
0.864	15.56	5600	3.6296	0.3571	0.3337
0.864	15.83	5700	3.4815	0.3774	0.3338
0.864	16.11	5800	3.5419	0.3714	0.3297
0.864	16.39	5900	3.4306	0.3868	0.3511
0.8581	16.67	6000	3.4905	0.3821	0.3566
0.8581	16.94	6100	3.3185	0.4046	0.3510
0.8581	17.22	6200	3.5655	0.3669	0.3322
0.8581	17.5	6300	3.4551	0.3848	0.3516
0.8581	17.78	6400	3.4727	0.3825	0.3495
0.8495	18.06	6500	3.4013	0.3863	0.3444
0.8495	18.33	6600	3.3959	0.3865	0.3545
0.8495	18.61	6700	3.3582	0.3876	0.3502
0.8495	18.89	6800	3.4716	0.3786	0.3425
0.8495	19.17	6900	3.3779	0.3912	0.3550
0.8449	19.44	7000	3.5027	0.3768	0.3494
0.8449	19.72	7100	3.3231	0.4070	0.3654
0.8449	20.0	7200	3.2727	0.4034	0.3706
0.8449	20.28	7300	3.4841	0.3778	0.3556
0.8449	20.56	7400	3.4613	0.3833	0.3505
0.8406	20.83	7500	3.4084	0.3861	0.3487
0.8406	21.11	7600	3.3010	0.3978	0.3590
0.8406	21.39	7700	3.3726	0.3909	0.3583
0.8406	21.67	7800	3.3891	0.3923	0.3596
0.8406	21.94	7900	3.4166	0.3859	0.3622
0.838	22.22	8000	3.3450	0.3940	0.3638
0.838	22.5	8100	3.3409	0.3977	0.3661
0.838	22.78	8200	3.3983	0.3930	0.3665
0.838	23.06	8300	3.4341	0.3814	0.3640
0.838	23.33	8400	3.4732	0.3769	0.3584
0.8354	23.61	8500	3.4941	0.3754	0.3475
0.8354	23.89	8600	3.4902	0.3706	0.3543
0.8354	24.17	8700	3.3955	0.3869	0.3577
0.8354	24.44	8800	3.4000	0.3896	0.3627
0.8354	24.72	8900	3.4061	0.3876	0.3593
0.8297	25.0	9000	3.3989	0.3864	0.3494
0.8297	25.28	9100	3.4073	0.3903	0.3585
0.8297	25.56	9200	3.3108	0.4050	0.3676
0.8297	25.83	9300	3.4202	0.3853	0.3587
0.8297	26.11	9400	3.3379	0.3987	0.3688
0.8291	26.39	9500	3.3224	0.4004	0.3664
0.8291	26.67	9600	3.2891	0.4051	0.3701
0.8291	26.94	9700	3.2901	0.4029	0.3705
0.8291	27.22	9800	3.3273	0.4008	0.3660
0.8291	27.5	9900	3.3488	0.3953	0.3699
0.8273	27.78	10000	3.3654	0.3938	0.3665
0.8273	28.06	10100	3.3521	0.3971	0.3695
0.8273	28.33	10200	3.2965	0.4055	0.3733
0.8273	28.61	10300	3.3683	0.3946	0.3679
0.8273	28.89	10400	3.3095	0.4033	0.3718
0.8267	29.17	10500	3.3116	0.4021	0.3726
0.8267	29.44	10600	3.3001	0.4044	0.3739
0.8267	29.72	10700	3.3072	0.4015	0.3701
0.8267	30.0	10800	3.3196	0.4000	0.3699
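
The log above can be read with a little arithmetic. The sketch below derives the number of optimizer steps per epoch from the last row, checks it against the Epoch column, and picks the best checkpoints per metric from a hand-copied subset of rows. The variable names and the helper epoch_at are illustrative, and the example count is only an upper bound because the last batch of an epoch may be partial.

```python
TOTAL_STEPS = 10800      # last step in the table
NUM_EPOCHS = 30          # num_epochs from the hyperparameters
TRAIN_BATCH_SIZE = 32    # train_batch_size from the hyperparameters

steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS
# Upper bound on training examples per epoch (last batch may be partial).
max_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE


def epoch_at(step):
    """Fractional epoch for a step, rounded like the Epoch column (hypothetical helper)."""
    return round(step / steps_per_epoch, 2)


# Subset of rows copied from the table: (step, validation loss, accuracy, F1).
ROWS = [
    (6100, 3.3185, 0.4046, 0.3510),
    (7100, 3.3231, 0.4070, 0.3654),
    (7200, 3.2727, 0.4034, 0.3706),
    (10200, 3.2965, 0.4055, 0.3733),
    (10600, 3.3001, 0.4044, 0.3739),
    (10800, 3.3196, 0.4000, 0.3699),
]

best_loss = min(ROWS, key=lambda r: r[1])  # lowest validation loss
best_acc = max(ROWS, key=lambda r: r[2])   # highest accuracy
best_f1 = max(ROWS, key=lambda r: r[3])    # highest F1

print("steps per epoch:", steps_per_epoch)
print("epoch at step 100:", epoch_at(100))
print("best loss / accuracy / F1 at steps:", best_loss[0], best_acc[0], best_f1[0])
```

In this log the reported final checkpoint (step 10800) is not the best row on any single metric: validation loss is lowest at step 7200, accuracy is highest at step 7100, and F1 is highest at step 10600, though the differences are small.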

Framework versions

Transformers 4.33.3
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
