scenario-NON-KD-PO-COPY-D2_data-AmazonScience_massive_all_1_166

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR-MSV-CL on the massive dataset. It achieves the following results on the evaluation set at the final logged step (185000):

Loss: 1.5710
Accuracy: 0.8548
F1: 0.8320

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 66
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
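
To illustrate what lr_scheduler_type: linear implies for the learning rate, here is a minimal sketch in plain Python. It makes two assumptions that the card does not state: there are no warmup steps, and there are about 18,711 optimizer steps per epoch (estimated from the results table as 185000 / 9.8872). The names linear_lr, STEPS_PER_EPOCH and TOTAL_STEPS are illustrative and are not taken from the training script.

```python
# Linear learning-rate decay to zero, sketched without any framework.
# Assumptions (not stated in the card): zero warmup steps, and about
# 18,711 optimizer steps per epoch, estimated as 185000 / 9.8872.

BASE_LR = 5e-05           # learning_rate from the card
NUM_EPOCHS = 10           # num_epochs from the card
STEPS_PER_EPOCH = 18711   # assumption, estimated from the results table
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Return the learning rate at `step` under linear decay with no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / total_steps)


if __name__ == "__main__":
    # Print the schedule at a few of the logged steps.
    for step in (0, 50000, 100000, 185000):
        print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 5e-05 and reaches zero at the last optimizer step, so the final logged checkpoint (step 185000) was trained with a learning rate close to zero.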

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
0.5707	0.2672	5000	0.7168	0.8151	0.7719
0.4302	0.5344	10000	0.6935	0.8335	0.7975
0.3537	0.8017	15000	0.6642	0.8438	0.8140
0.2373	1.0689	20000	0.7224	0.8471	0.8252
0.2273	1.3361	25000	0.7724	0.8440	0.8194
0.2202	1.6033	30000	0.7825	0.8417	0.8160
0.2054	1.8706	35000	0.7538	0.8528	0.8280
0.1381	2.1378	40000	0.8520	0.8517	0.8304
0.142	2.4050	45000	0.8586	0.8494	0.8278
0.1409	2.6722	50000	0.8778	0.8495	0.8210
0.1406	2.9394	55000	0.8792	0.8501	0.8277
0.1036	3.2067	60000	0.9885	0.8481	0.8247
0.1069	3.4739	65000	0.9740	0.8486	0.8232
0.111	3.7411	70000	0.9566	0.8513	0.8285
0.092	4.0083	75000	0.9918	0.8539	0.8327
0.077	4.2756	80000	1.0661	0.8540	0.8324
0.0783	4.5428	85000	1.1273	0.8515	0.8267
0.0799	4.8100	90000	1.0931	0.8507	0.8267
0.0521	5.0772	95000	1.2091	0.8510	0.8272
0.0566	5.3444	100000	1.2432	0.8508	0.8279
0.061	5.6117	105000	1.2415	0.8529	0.8274
0.0557	5.8789	110000	1.2190	0.8540	0.8299
0.0463	6.1461	115000	1.3008	0.8528	0.8265
0.0449	6.4133	120000	1.3608	0.8520	0.8295
0.0454	6.6806	125000	1.3160	0.8539	0.8304
0.0449	6.9478	130000	1.3162	0.8548	0.8329
0.0352	7.2150	135000	1.3967	0.8534	0.8293
0.03	7.4822	140000	1.3989	0.8545	0.8319
0.0357	7.7495	145000	1.4052	0.8525	0.8279
0.0285	8.0167	150000	1.4544	0.8534	0.8301
0.0247	8.2839	155000	1.4825	0.8535	0.8293
0.0266	8.5511	160000	1.5078	0.8546	0.8324
0.0237	8.8183	165000	1.5189	0.8545	0.8322
0.0183	9.0856	170000	1.5705	0.8531	0.8309
0.0175	9.3528	175000	1.5564	0.8538	0.8307
0.0216	9.6200	180000	1.5783	0.8549	0.8321
0.0126	9.8872	185000	1.5710	0.8548	0.8320
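
The table can be read programmatically to compare checkpoints. The sketch below uses a handful of rows copied from the table above (not the full log) and picks the row with the lowest validation loss, the highest accuracy and the highest F1. The helper best_by and the variable names are illustrative only.

```python
# Selected rows copied from the results table above:
# (epoch, step, validation_loss, accuracy, f1)
ROWS = [
    (0.2672, 5000, 0.7168, 0.8151, 0.7719),
    (0.8017, 15000, 0.6642, 0.8438, 0.8140),
    (1.8706, 35000, 0.7538, 0.8528, 0.8280),
    (4.0083, 75000, 0.9918, 0.8539, 0.8327),
    (6.9478, 130000, 1.3162, 0.8548, 0.8329),
    (9.6200, 180000, 1.5783, 0.8549, 0.8321),
    (9.8872, 185000, 1.5710, 0.8548, 0.8320),
]

LOSS, ACC, F1 = 2, 3, 4  # column indices within each row


def best_by(rows, column, highest=True):
    """Return the row with the highest (or lowest) value in `column`."""
    pick = max if highest else min
    return pick(rows, key=lambda row: row[column])


lowest_loss = best_by(ROWS, LOSS, highest=False)
best_acc = best_by(ROWS, ACC)
best_f1 = best_by(ROWS, F1)

# Steps per epoch, estimated from the last logged row.
steps_per_epoch = round(ROWS[-1][1] / ROWS[-1][0])

if __name__ == "__main__":
    print("lowest validation loss at step", lowest_loss[1])
    print("highest accuracy at step", best_acc[1])
    print("highest F1 at step", best_f1[1])
    print("estimated steps per epoch:", steps_per_epoch)
```

In the table, validation loss is lowest at step 15000 (0.6642) and rises to 1.5710 by the final step, while accuracy and F1 stay close to 0.85 and 0.83 from roughly epoch 2 onward. The highest F1 (0.8329) is logged at step 130000, slightly above the final value of 0.8320.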

Framework versions

Transformers 4.44.2
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.19.1
