scenario-KD-PO-MSV-EN-CL-D2_data-en-massive_all_1_144

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set; these are the values of the final checkpoint (step 10800, epoch 30.0) in the training results:

Loss: 11.8220
Accuracy: 0.4279
F1: 0.4121

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 44
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
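
As a rough illustration of what lr_scheduler_type: linear implies for these settings, the sketch below computes the learning rate at a given optimizer step. It is not taken from the training code. The total of 10800 steps comes from the last row of the training results, and warmup_steps=0 is an assumption, since no warmup value is listed above. The function name linear_lr is hypothetical.

```python
def linear_lr(step: int,
              base_lr: float = 5e-05,    # learning_rate listed above
              total_steps: int = 10800,  # last step in the training results
              warmup_steps: int = 0) -> float:  # assumption: no warmup listed
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for step in (0, 5400, 10800):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at the listed learning_rate, is halved at the midpoint of training, and reaches zero at the final step.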

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
No log	0.28	100	12.4285	0.1433	0.0512
No log	0.56	200	11.0024	0.3202	0.2004
No log	0.83	300	10.2163	0.3581	0.2694
No log	1.11	400	10.2991	0.3690	0.3079
6.8116	1.39	500	9.8243	0.3839	0.3435
6.8116	1.67	600	11.2641	0.3606	0.3247
6.8116	1.94	700	10.3190	0.3851	0.3485
6.8116	2.22	800	10.3062	0.4031	0.3666
6.8116	2.5	900	10.6951	0.3961	0.3569
2.6735	2.78	1000	9.7089	0.4174	0.3764
2.6735	3.06	1100	11.1929	0.3794	0.3381
2.6735	3.33	1200	12.2376	0.3596	0.3533
2.6735	3.61	1300	11.6097	0.3843	0.3389
2.6735	3.89	1400	12.9304	0.3542	0.3393
1.7404	4.17	1500	11.2853	0.4047	0.3643
1.7404	4.44	1600	11.8882	0.3952	0.3663
1.7404	4.72	1700	12.6707	0.3697	0.3461
1.7404	5.0	1800	12.5911	0.3778	0.3586
1.7404	5.28	1900	12.2372	0.3876	0.3636
1.2139	5.56	2000	13.7821	0.3691	0.3591
1.2139	5.83	2100	13.3563	0.3712	0.3682
1.2139	6.11	2200	13.6273	0.3692	0.3702
1.2139	6.39	2300	13.3701	0.3780	0.3645
1.2139	6.67	2400	13.4374	0.3802	0.3644
0.9583	6.94	2500	13.0415	0.3766	0.3572
0.9583	7.22	2600	12.2692	0.3994	0.3723
0.9583	7.5	2700	13.0153	0.3798	0.3610
0.9583	7.78	2800	13.5494	0.3779	0.3576
0.9583	8.06	2900	12.7093	0.3921	0.3723
0.7598	8.33	3000	15.0333	0.3523	0.3566
0.7598	8.61	3100	14.0520	0.3699	0.3675
0.7598	8.89	3200	13.2860	0.3828	0.3733
0.7598	9.17	3300	13.1891	0.3863	0.3715
0.7598	9.44	3400	13.7420	0.3739	0.3654
0.6633	9.72	3500	13.6071	0.3789	0.3715
0.6633	10.0	3600	12.7524	0.3865	0.3769
0.6633	10.28	3700	12.6783	0.3976	0.3893
0.6633	10.56	3800	13.6121	0.3756	0.3706
0.6633	10.83	3900	12.4699	0.4064	0.3983
0.5962	11.11	4000	12.5135	0.4031	0.3884
0.5962	11.39	4100	14.0802	0.3692	0.3748
0.5962	11.67	4200	12.0230	0.4129	0.3988
0.5962	11.94	4300	13.3263	0.3832	0.3845
0.5962	12.22	4400	12.9188	0.3991	0.3804
0.5363	12.5	4500	12.1398	0.4067	0.3917
0.5363	12.78	4600	13.3469	0.3868	0.3778
0.5363	13.06	4700	12.7122	0.3912	0.3814
0.5363	13.33	4800	13.4259	0.3855	0.3760
0.5363	13.61	4900	12.8922	0.3973	0.3830
0.5077	13.89	5000	12.4985	0.4065	0.3983
0.5077	14.17	5100	12.3003	0.4096	0.3962
0.5077	14.44	5200	12.7712	0.4089	0.3995
0.5077	14.72	5300	12.2024	0.4155	0.4029
0.5077	15.0	5400	12.0136	0.4214	0.4079
0.4741	15.28	5500	12.4490	0.4086	0.3918
0.4741	15.56	5600	12.2443	0.4131	0.4004
0.4741	15.83	5700	11.9125	0.4265	0.4132
0.4741	16.11	5800	12.1195	0.4160	0.4013
0.4741	16.39	5900	12.7143	0.4101	0.4014
0.4451	16.67	6000	12.2924	0.4146	0.4016
0.4451	16.94	6100	11.6365	0.4283	0.4073
0.4451	17.22	6200	11.5866	0.4213	0.4020
0.4451	17.5	6300	11.7064	0.4257	0.4069
0.4451	17.78	6400	12.3482	0.4104	0.3997
0.4282	18.06	6500	11.9962	0.4217	0.4051
0.4282	18.33	6600	12.4831	0.4135	0.4025
0.4282	18.61	6700	12.3656	0.4125	0.4056
0.4282	18.89	6800	12.3032	0.4137	0.3984
0.4282	19.17	6900	11.7594	0.4298	0.4118
0.4082	19.44	7000	11.5141	0.4324	0.4158
0.4082	19.72	7100	11.7421	0.4274	0.4178
0.4082	20.0	7200	11.6144	0.4311	0.4125
0.4082	20.28	7300	12.2621	0.4192	0.4069
0.4082	20.56	7400	12.0426	0.4171	0.4043
0.3952	20.83	7500	11.6613	0.4243	0.4085
0.3952	21.11	7600	12.0199	0.4193	0.4029
0.3952	21.39	7700	12.5562	0.4112	0.4052
0.3952	21.67	7800	12.1838	0.4206	0.4086
0.3952	21.94	7900	12.1778	0.4175	0.4038
0.3855	22.22	8000	11.7222	0.4285	0.4131
0.3855	22.5	8100	11.9441	0.4243	0.4086
0.3855	22.78	8200	11.9899	0.4257	0.4120
0.3855	23.06	8300	12.3196	0.4207	0.4143
0.3855	23.33	8400	11.8328	0.4268	0.4092
0.373	23.61	8500	11.8007	0.4300	0.4140
0.373	23.89	8600	11.9800	0.4222	0.4089
0.373	24.17	8700	12.1881	0.4192	0.4057
0.373	24.44	8800	12.3038	0.4163	0.4081
0.373	24.72	8900	12.1807	0.4210	0.4073
0.3656	25.0	9000	11.7511	0.4268	0.4108
0.3656	25.28	9100	11.9884	0.4218	0.4088
0.3656	25.56	9200	11.7588	0.4264	0.4081
0.3656	25.83	9300	11.6659	0.4289	0.4105
0.3656	26.11	9400	12.1028	0.4207	0.4068
0.3573	26.39	9500	11.6687	0.4317	0.4147
0.3573	26.67	9600	11.7249	0.4279	0.4109
0.3573	26.94	9700	11.6570	0.4273	0.4104
0.3573	27.22	9800	11.6475	0.4309	0.4135
0.3573	27.5	9900	11.7960	0.4262	0.4116
0.3518	27.78	10000	11.7591	0.4263	0.4123
0.3518	28.06	10100	11.9438	0.4225	0.4084
0.3518	28.33	10200	11.8072	0.4256	0.4116
0.3518	28.61	10300	11.8760	0.4254	0.4110
0.3518	28.89	10400	12.0118	0.4214	0.4089
0.3511	29.17	10500	11.9257	0.4251	0.4115
0.3511	29.44	10600	11.9128	0.4250	0.4101
0.3511	29.72	10700	11.8159	0.4270	0.4117
0.3511	30.0	10800	11.8220	0.4279	0.4121
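
The final checkpoint is not the best row in the table for any of the three metrics. The sketch below shows one way to pick the best checkpoint per metric. The rows are a small subset copied from the table above, and the names EvalRow and best_by are hypothetical helpers, not part of any library. The last lines estimate steps per epoch from the final row, assuming one optimizer step per batch of 32 (no gradient accumulation, single device), which the card does not state.

```python
from typing import Callable, List, NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    accuracy: float
    f1: float


# Subset of rows copied from the training results table.
ROWS: List[EvalRow] = [
    EvalRow(1000, 9.7089, 0.4174, 0.3764),
    EvalRow(5700, 11.9125, 0.4265, 0.4132),
    EvalRow(7000, 11.5141, 0.4324, 0.4158),
    EvalRow(7100, 11.7421, 0.4274, 0.4178),
    EvalRow(9500, 11.6687, 0.4317, 0.4147),
    EvalRow(10800, 11.8220, 0.4279, 0.4121),
]


def best_by(rows: List[EvalRow],
            key: Callable[[EvalRow], float],
            maximize: bool = True) -> EvalRow:
    """Return the row with the highest (or lowest) value of `key`."""
    return max(rows, key=key) if maximize else min(rows, key=key)


best_acc = best_by(ROWS, lambda r: r.accuracy)
best_f1 = best_by(ROWS, lambda r: r.f1)
best_loss = best_by(ROWS, lambda r: r.val_loss, maximize=False)

# Steps per epoch implied by the final row: 10800 steps over 30 epochs.
steps_per_epoch = 10800 // 30
# Upper bound on training examples, assuming one step per batch of 32.
max_train_examples = steps_per_epoch * 32

if __name__ == "__main__":
    print("best accuracy:", best_acc)
    print("best F1:", best_f1)
    print("lowest validation loss:", best_loss)
    print("steps per epoch:", steps_per_epoch)
    print("training examples (upper bound):", max_train_examples)
```

On the full table the same selection gives step 7000 for accuracy (0.4324), step 7100 for F1 (0.4178), and step 1000 for validation loss (9.7089). The gap between the lowest validation loss at step 1000 and the best accuracy much later means the choice of selection metric changes which checkpoint is kept.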

Framework versions

Transformers 4.33.3
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
