scenario-KD-PR-MSV-EN-CL-D2_data-en-massive_all_1_166

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set, which match the final logged evaluation (epoch 30.0, step 10800) in the training results table:

Loss: 3.1294
Accuracy: 0.4517
F1: 0.4433

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 66
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
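
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given step. It assumes zero warmup steps (the card lists none) and takes the total step count, 10800, from the last row of the training results table. The names `linear_lr`, `TOTAL_STEPS`, and `WARMUP_STEPS` are hypothetical and are not taken from the training code.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 32
NUM_EPOCHS = 30

# Taken from the last row of the training results table (epoch 30.0, step 10800).
TOTAL_STEPS = 10800

# Assumption: no warmup, since the card does not list any warmup setting.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup followed by linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Optimizer steps per epoch implied by the table (10800 steps over 30 epochs).
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

if __name__ == "__main__":
    for step in (0, 5400, 10800):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5e-05, is halved by the midpoint of training, and reaches zero at the final step. With a batch size of 32, the implied 360 steps per epoch correspond to at most 11,520 training examples per epoch.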

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
No log	0.28	100	3.7346	0.2787	0.1429
No log	0.56	200	3.3645	0.3864	0.2813
No log	0.83	300	3.3794	0.3835	0.3163
No log	1.11	400	3.2108	0.4210	0.3411
2.3822	1.39	500	3.2084	0.4271	0.3699
2.3822	1.67	600	3.3062	0.4016	0.3490
2.3822	1.94	700	3.2449	0.4132	0.3835
2.3822	2.22	800	3.1548	0.4312	0.3880
2.3822	2.5	900	3.1709	0.4314	0.3838
1.4286	2.78	1000	3.2567	0.4224	0.3884
1.4286	3.06	1100	3.1783	0.4350	0.3915
1.4286	3.33	1200	3.2211	0.4300	0.3733
1.4286	3.61	1300	3.3106	0.4191	0.3951
1.4286	3.89	1400	3.2384	0.4332	0.4036
1.1816	4.17	1500	3.1592	0.4444	0.3974
1.1816	4.44	1600	3.2437	0.4177	0.3883
1.1816	4.72	1700	3.3608	0.4095	0.3988
1.1816	5.0	1800	3.2164	0.4222	0.3920
1.1816	5.28	1900	3.3678	0.4175	0.4033
1.0555	5.56	2000	3.2902	0.4247	0.4130
1.0555	5.83	2100	3.0966	0.4534	0.4204
1.0555	6.11	2200	3.2431	0.4367	0.4165
1.0555	6.39	2300	3.2783	0.4297	0.4044
1.0555	6.67	2400	3.4989	0.3955	0.3785
0.9971	6.94	2500	3.8710	0.3411	0.3596
0.9971	7.22	2600	3.6151	0.3881	0.3837
0.9971	7.5	2700	3.4787	0.3939	0.3997
0.9971	7.78	2800	3.3991	0.4045	0.3941
0.9971	8.06	2900	3.4382	0.4166	0.4154
0.9389	8.33	3000	3.2570	0.4235	0.4135
0.9389	8.61	3100	3.2388	0.4250	0.4056
0.9389	8.89	3200	3.4120	0.4067	0.4031
0.9389	9.17	3300	3.1757	0.4413	0.4144
0.9389	9.44	3400	3.3490	0.4163	0.4080
0.9179	9.72	3500	2.9801	0.4754	0.4437
0.9179	10.0	3600	3.2767	0.4280	0.4156
0.9179	10.28	3700	3.3163	0.4169	0.4131
0.9179	10.56	3800	3.2532	0.4307	0.4094
0.9179	10.83	3900	3.2696	0.4218	0.4004
0.8936	11.11	4000	3.2218	0.4317	0.4061
0.8936	11.39	4100	3.0951	0.4531	0.4236
0.8936	11.67	4200	3.3236	0.4216	0.4165
0.8936	11.94	4300	3.3463	0.4189	0.4076
0.8936	12.22	4400	3.2788	0.4258	0.4061
0.8822	12.5	4500	3.1698	0.4394	0.4218
0.8822	12.78	4600	3.1792	0.4463	0.4273
0.8822	13.06	4700	3.3204	0.4198	0.4161
0.8822	13.33	4800	3.2768	0.4350	0.4176
0.8822	13.61	4900	3.1899	0.4473	0.4319
0.8701	13.89	5000	3.2120	0.4381	0.4231
0.8701	14.17	5100	3.3195	0.4212	0.4145
0.8701	14.44	5200	3.1320	0.4493	0.4297
0.8701	14.72	5300	3.2009	0.4435	0.4250
0.8701	15.0	5400	3.1418	0.4453	0.4219
0.8598	15.28	5500	3.3812	0.4151	0.4237
0.8598	15.56	5600	3.3899	0.4179	0.4160
0.8598	15.83	5700	3.2094	0.4429	0.4344
0.8598	16.11	5800	3.2356	0.4420	0.4366
0.8598	16.39	5900	3.5436	0.3909	0.4047
0.8552	16.67	6000	3.1463	0.4484	0.4287
0.8552	16.94	6100	3.0971	0.4589	0.4393
0.8552	17.22	6200	3.3156	0.4183	0.4100
0.8552	17.5	6300	3.2175	0.4378	0.4298
0.8552	17.78	6400	3.2079	0.4402	0.4261
0.8465	18.06	6500	3.2534	0.4322	0.4185
0.8465	18.33	6600	3.1361	0.4483	0.4267
0.8465	18.61	6700	3.1913	0.4403	0.4295
0.8465	18.89	6800	3.0707	0.4600	0.4364
0.8465	19.17	6900	3.1861	0.4446	0.4315
0.8426	19.44	7000	3.0143	0.4689	0.4494
0.8426	19.72	7100	3.1831	0.4422	0.4359
0.8426	20.0	7200	3.1656	0.4489	0.4353
0.8426	20.28	7300	3.1168	0.4501	0.4406
0.8426	20.56	7400	3.1521	0.4489	0.4408
0.8402	20.83	7500	3.1576	0.4482	0.4385
0.8402	21.11	7600	3.0448	0.4631	0.4422
0.8402	21.39	7700	3.1503	0.4498	0.4423
0.8402	21.67	7800	3.1675	0.4445	0.4337
0.8402	21.94	7900	3.2237	0.4363	0.4309
0.8348	22.22	8000	3.1466	0.4461	0.4375
0.8348	22.5	8100	3.1429	0.4410	0.4272
0.8348	22.78	8200	3.4103	0.4102	0.4182
0.8348	23.06	8300	3.0529	0.4638	0.4445
0.8348	23.33	8400	3.2268	0.4380	0.4307
0.8332	23.61	8500	3.0921	0.4562	0.4461
0.8332	23.89	8600	3.2255	0.4397	0.4415
0.8332	24.17	8700	3.1758	0.4432	0.4360
0.8332	24.44	8800	3.2341	0.4352	0.4290
0.8332	24.72	8900	3.1512	0.4491	0.4381
0.8297	25.0	9000	3.0930	0.4553	0.4378
0.8297	25.28	9100	3.0608	0.4626	0.4447
0.8297	25.56	9200	3.1169	0.4520	0.4421
0.8297	25.83	9300	3.2131	0.4359	0.4319
0.8297	26.11	9400	3.1056	0.4515	0.4412
0.8269	26.39	9500	3.1172	0.4490	0.4427
0.8269	26.67	9600	3.1082	0.4514	0.4401
0.8269	26.94	9700	3.1088	0.4554	0.4427
0.8269	27.22	9800	3.1340	0.4509	0.4407
0.8269	27.5	9900	3.1682	0.4466	0.4416
0.827	27.78	10000	3.1441	0.4509	0.4433
0.827	28.06	10100	3.2030	0.4394	0.4336
0.827	28.33	10200	3.2133	0.4393	0.4359
0.827	28.61	10300	3.1405	0.4480	0.4354
0.827	28.89	10400	3.1575	0.4471	0.4375
0.825	29.17	10500	3.1558	0.4471	0.4382
0.825	29.44	10600	3.1283	0.4504	0.4395
0.825	29.72	10700	3.1274	0.4521	0.4403
0.825	30.0	10800	3.1294	0.4517	0.4433
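
The headline metrics come from the final row, but earlier evaluations score higher: in the table above, step 3500 has the highest accuracy (0.4754) and the lowest validation loss (2.9801), and step 7000 has the highest F1 (0.4494). The sketch below shows one way to pick such rows out of the tab-separated log. It embeds only four rows copied from the table, and the helper names `parse_rows` and `best_by` are hypothetical.

```python
# A few rows copied from the table above (tab-separated columns):
# Training Loss, Epoch, Step, Validation Loss, Accuracy, F1
ROWS = """\
No log\t0.28\t100\t3.7346\t0.2787\t0.1429
0.9179\t9.72\t3500\t2.9801\t0.4754\t0.4437
0.8426\t19.44\t7000\t3.0143\t0.4689\t0.4494
0.825\t30.0\t10800\t3.1294\t0.4517\t0.4433
"""


def parse_rows(text: str) -> list[dict]:
    """Parse tab-separated log rows into dictionaries with numeric fields."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, acc, f1 = line.split("\t")
        rows.append({
            # "No log" means no training loss had been logged yet.
            "train_loss": None if train_loss == "No log" else float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "accuracy": float(acc),
            "f1": float(f1),
        })
    return rows


def best_by(rows: list[dict], key: str, lower_is_better: bool = False) -> dict:
    """Return the row with the best value for the given metric."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda r: r[key])


rows = parse_rows(ROWS)

if __name__ == "__main__":
    print("best accuracy:", best_by(rows, "accuracy")["step"])
    print("best F1:", best_by(rows, "f1")["step"])
    print("lowest val loss:", best_by(rows, "val_loss", lower_is_better=True)["step"])
```

Whether an earlier checkpoint was saved or is retrievable is not stated in the card; the sketch only identifies which logged evaluations scored best.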

Framework versions

Transformers 4.33.3
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
