scenario-KD-PO-MSV-EN-EN-D2_data-en-massive_all_1_155

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-en-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set (these match the final logged step, 10800, at epoch 30.0):

Loss: 16.6549
Accuracy: 0.3446
F1: 0.3337

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 55
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
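
As a rough illustration of what these settings imply, the sketch below computes the learning rate under a linear schedule and the number of optimizer steps per epoch. It uses the final step count of 10800 reported in the training results. The card lists no warmup, gradient accumulation, or multi-device setup, so the sketch assumes zero warmup steps and one batch of 32 per optimizer step; `linear_lr` is a hypothetical helper name, not part of any library.

```python
# Values taken from the hyperparameter list and the training results.
BASE_LR = 5e-5
NUM_EPOCHS = 30
TRAIN_BATCH_SIZE = 32
TOTAL_STEPS = 10800  # last step logged in the training results

# Assumption: no warmup, since the card does not list any.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# 10800 steps over 30 epochs gives the number of optimizer steps per epoch.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# Upper bound on training examples per epoch (the last batch may be partial).
# Assumption: one batch of 32 per optimizer step, no gradient accumulation.
max_examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("max examples per epoch:", max_examples_per_epoch)
    for step in (0, 2700, 5400, 8100, 10800):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate falls from 5e-05 at step 0 to half that value at step 5400 and reaches zero at step 10800.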

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
No log	0.28	100	14.6039	0.1460	0.0332
No log	0.56	200	13.6711	0.2254	0.1373
No log	0.83	300	13.3897	0.2697	0.1900
No log	1.11	400	12.8696	0.2938	0.2276
8.6227	1.39	500	13.2466	0.3041	0.2445
8.6227	1.67	600	13.7732	0.2966	0.2455
8.6227	1.94	700	12.6296	0.3268	0.2769
8.6227	2.22	800	13.8225	0.3207	0.2728
8.6227	2.5	900	12.9157	0.3359	0.2878
3.2961	2.78	1000	13.5723	0.3246	0.2959
3.2961	3.06	1100	13.1047	0.3395	0.2893
3.2961	3.33	1200	14.0601	0.3199	0.2866
3.2961	3.61	1300	14.0842	0.3263	0.2965
3.2961	3.89	1400	14.2522	0.3144	0.2827
2.1563	4.17	1500	13.7890	0.3395	0.2904
2.1563	4.44	1600	14.3511	0.3280	0.2918
2.1563	4.72	1700	15.1711	0.3292	0.2792
2.1563	5.0	1800	15.7697	0.3151	0.2757
2.1563	5.28	1900	15.5149	0.3204	0.2941
1.5208	5.56	2000	15.3098	0.3181	0.2967
1.5208	5.83	2100	14.7872	0.3418	0.3052
1.5208	6.11	2200	15.5063	0.3214	0.2953
1.5208	6.39	2300	15.8674	0.3251	0.2960
1.5208	6.67	2400	16.2428	0.3215	0.2983
1.1778	6.94	2500	15.8196	0.3265	0.3121
1.1778	7.22	2600	16.2186	0.3169	0.2915
1.1778	7.5	2700	16.1006	0.3221	0.3010
1.1778	7.78	2800	15.8025	0.3398	0.3043
1.1778	8.06	2900	15.8275	0.3290	0.3066
0.9182	8.33	3000	16.9089	0.3173	0.3021
0.9182	8.61	3100	16.2800	0.3434	0.3066
0.9182	8.89	3200	16.4016	0.3300	0.3142
0.9182	9.17	3300	17.1270	0.3069	0.3016
0.9182	9.44	3400	16.0886	0.3334	0.2990
0.7575	9.72	3500	17.9885	0.3044	0.2897
0.7575	10.0	3600	16.9147	0.3344	0.3143
0.7575	10.28	3700	17.0150	0.3239	0.3063
0.7575	10.56	3800	17.2972	0.3188	0.3044
0.7575	10.83	3900	17.1954	0.3156	0.2978
0.654	11.11	4000	16.5796	0.3358	0.3079
0.654	11.39	4100	17.8000	0.3253	0.2998
0.654	11.67	4200	17.0779	0.3261	0.3053
0.654	11.94	4300	17.4166	0.3129	0.3022
0.654	12.22	4400	17.3001	0.3145	0.3053
0.5631	12.5	4500	18.0636	0.3119	0.3024
0.5631	12.78	4600	17.2984	0.3226	0.3088
0.5631	13.06	4700	16.9070	0.3382	0.3150
0.5631	13.33	4800	17.3121	0.3279	0.3142
0.5631	13.61	4900	17.1523	0.3296	0.3179
0.5282	13.89	5000	17.8192	0.3126	0.3031
0.5282	14.17	5100	16.7179	0.3306	0.3117
0.5282	14.44	5200	17.9113	0.3191	0.3102
0.5282	14.72	5300	16.8577	0.3304	0.3121
0.5282	15.0	5400	18.0535	0.3160	0.3061
0.4804	15.28	5500	17.8274	0.3169	0.3059
0.4804	15.56	5600	17.0363	0.3325	0.3193
0.4804	15.83	5700	16.8001	0.3331	0.3186
0.4804	16.11	5800	17.4191	0.3242	0.3143
0.4804	16.39	5900	16.8495	0.3420	0.3263
0.4495	16.67	6000	16.8531	0.3397	0.3189
0.4495	16.94	6100	17.4010	0.3289	0.3167
0.4495	17.22	6200	16.3403	0.3474	0.3284
0.4495	17.5	6300	16.8162	0.3415	0.3272
0.4495	17.78	6400	17.3864	0.3340	0.3198
0.4209	18.06	6500	17.6548	0.3235	0.3126
0.4209	18.33	6600	16.0579	0.3551	0.3288
0.4209	18.61	6700	15.8394	0.3599	0.3361
0.4209	18.89	6800	16.9152	0.3349	0.3180
0.4209	19.17	6900	16.2478	0.3534	0.3286
0.3953	19.44	7000	16.8572	0.3343	0.3222
0.3953	19.72	7100	16.4133	0.3458	0.3291
0.3953	20.0	7200	15.6227	0.3542	0.3309
0.3953	20.28	7300	16.2866	0.3487	0.3271
0.3953	20.56	7400	16.6866	0.3472	0.3231
0.378	20.83	7500	15.9135	0.3586	0.3359
0.378	21.11	7600	16.4220	0.3483	0.3240
0.378	21.39	7700	15.9214	0.3585	0.3380
0.378	21.67	7800	16.0507	0.3502	0.3336
0.378	21.94	7900	17.1391	0.3333	0.3229
0.3651	22.22	8000	16.5540	0.3449	0.3282
0.3651	22.5	8100	16.2101	0.3501	0.3253
0.3651	22.78	8200	16.1821	0.3515	0.3351
0.3651	23.06	8300	17.2145	0.3306	0.3218
0.3651	23.33	8400	16.1442	0.3491	0.3334
0.3576	23.61	8500	16.1359	0.3492	0.3310
0.3576	23.89	8600	16.8213	0.3366	0.3271
0.3576	24.17	8700	16.4038	0.3450	0.3328
0.3576	24.44	8800	16.0881	0.3521	0.3285
0.3576	24.72	8900	15.9137	0.3595	0.3379
0.3407	25.0	9000	16.6534	0.3392	0.3341
0.3407	25.28	9100	16.4548	0.3450	0.3307
0.3407	25.56	9200	16.3928	0.3484	0.3288
0.3407	25.83	9300	16.4631	0.3471	0.3345
0.3407	26.11	9400	16.5766	0.3465	0.3315
0.3372	26.39	9500	16.4303	0.3479	0.3333
0.3372	26.67	9600	16.3788	0.3493	0.3347
0.3372	26.94	9700	16.6492	0.3441	0.3304
0.3372	27.22	9800	16.2894	0.3520	0.3365
0.3372	27.5	9900	16.6262	0.3445	0.3306
0.3302	27.78	10000	16.5817	0.3461	0.3344
0.3302	28.06	10100	16.6601	0.3464	0.3349
0.3302	28.33	10200	16.4713	0.3492	0.3364
0.3302	28.61	10300	16.4882	0.3478	0.3366
0.3302	28.89	10400	16.3544	0.3502	0.3376
0.3284	29.17	10500	16.6563	0.3451	0.3336
0.3284	29.44	10600	16.5782	0.3460	0.3315
0.3284	29.72	10700	16.6588	0.3440	0.3319
0.3284	30.0	10800	16.6549	0.3446	0.3337
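
The headline results are those of the last row, which is not the best row on any of the three metrics. The sketch below shows one way to pick the best logged step per metric. It embeds only a handful of rows copied from the table above, chosen to include the table's lowest validation loss and highest accuracy and F1, plus the final step. The card does not state which checkpoints were saved, so this illustrates reading the log and is not a claim about available weights.

```python
# (step, validation_loss, accuracy, f1) rows copied from the results table.
# Only a subset is included for brevity; extend the list to cover all rows.
ROWS = [
    (700, 12.6296, 0.3268, 0.2769),
    (6700, 15.8394, 0.3599, 0.3361),
    (7700, 15.9214, 0.3585, 0.3380),
    (8900, 15.9137, 0.3595, 0.3379),
    (10800, 16.6549, 0.3446, 0.3337),
]

# Lower is better for loss; higher is better for accuracy and F1.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_accuracy = max(ROWS, key=lambda row: row[2])
best_by_f1 = max(ROWS, key=lambda row: row[3])

if __name__ == "__main__":
    for name, row in [
        ("lowest validation loss", best_by_loss),
        ("highest accuracy", best_by_accuracy),
        ("highest F1", best_by_f1),
    ]:
        step, loss, accuracy, f1 = row
        print(f"{name}: step {step} (loss={loss}, accuracy={accuracy}, f1={f1})")
```

On these rows the three criteria select different steps (700, 6700, and 7700), so the choice of selection metric matters when comparing checkpoints from this run.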

Framework versions

Transformers 4.33.3
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
