scenario-KD-PR-MSV-D2_data-cl-massive_all_1_155

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set (these match the final logged evaluation, at step 265000, in the training results):

Loss: 2.5659
Accuracy: 0.6234
F1: 0.5935
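
The card does not say how F1 is averaged. As a rough illustration only, the sketch below computes accuracy and a macro-averaged F1 over a handful of toy labels. The macro averaging, the helper names `accuracy` and `macro_f1`, and the toy labels are all assumptions for illustration, not details taken from the training run.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging; the card does not specify it)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # reference class gains a false negative
    scores = []
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy labels, purely illustrative (not drawn from the evaluation set)
y_true = ["alarm_set", "alarm_set", "play_music", "weather_query"]
y_pred = ["alarm_set", "play_music", "play_music", "play_music"]

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
```

With many classes, a macro average of this kind can sit below accuracy when rare classes are predicted poorly, which would be consistent with the gap reported above if macro averaging was in fact used.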

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 55
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
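
To make the linear schedule concrete, here is a minimal sketch of the learning rate as a function of the training step. It assumes no warmup (none is listed above) and roughly 270,000 total steps, estimated from the training results, where step 90000 corresponds to epoch 10.0 (about 9,000 steps per epoch over 30 epochs). The helper `linear_lr` is hypothetical and only mirrors the usual linear-decay formula; it is not taken from the training code.

```python
BASE_LR = 5e-05        # learning_rate from the list above
TOTAL_STEPS = 270_000  # assumption: 30 epochs x ~9,000 steps per epoch
WARMUP_STEPS = 0       # assumption: no warmup is listed

def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))

# Learning rate at the start, the midpoint, the last logged step, and the end
for step in (0, 135_000, 265_000, 270_000):
    print(step, linear_lr(step))
```

Under these assumptions the rate halves by step 135000 and is close to zero by the last logged evaluation at step 265000.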

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
1.3685	0.56	5000	2.3825	0.6101	0.5721
1.1362	1.11	10000	2.3303	0.6277	0.5814
1.1011	1.67	15000	2.2524	0.6458	0.5921
1.0061	2.22	20000	2.3519	0.6330	0.5861
1.004	2.78	25000	2.3514	0.6260	0.5797
0.9543	3.33	30000	2.4451	0.6188	0.5833
0.957	3.89	35000	2.3458	0.6352	0.5832
0.9182	4.45	40000	2.4666	0.6177	0.5823
0.9207	5.0	45000	2.4348	0.6297	0.5832
0.8814	5.56	50000	2.5433	0.6051	0.5682
0.8671	6.11	55000	2.5489	0.6119	0.5763
0.8732	6.67	60000	2.4481	0.6266	0.5702
0.8564	7.23	65000	2.5152	0.6242	0.5843
0.8639	7.78	70000	2.5782	0.6095	0.5733
0.8547	8.34	75000	2.5712	0.6124	0.5778
0.8539	8.89	80000	2.5092	0.6143	0.5676
0.8379	9.45	85000	2.5264	0.6182	0.5763
0.8388	10.0	90000	2.5294	0.6244	0.5873
0.8348	10.56	95000	2.6556	0.6065	0.5799
0.8216	11.12	100000	2.5251	0.6213	0.5724
0.8285	11.67	105000	2.5312	0.6201	0.5772
0.8187	12.23	110000	2.6448	0.6051	0.5773
0.8244	12.78	115000	2.5533	0.6168	0.5823
0.8135	13.34	120000	2.5669	0.6161	0.5743
0.8185	13.9	125000	2.5724	0.6178	0.5839
0.8147	14.45	130000	2.5826	0.6152	0.5770
0.8122	15.01	135000	2.5439	0.6247	0.5838
0.8045	15.56	140000	2.5643	0.6169	0.5721
0.7994	16.12	145000	2.5887	0.6196	0.5782
0.8002	16.67	150000	2.5524	0.6195	0.5845
0.7976	17.23	155000	2.6154	0.6112	0.5819
0.798	17.79	160000	2.5928	0.6148	0.5824
0.7995	18.34	165000	2.6006	0.6140	0.5811
0.801	18.9	170000	2.5610	0.6212	0.5863
0.7937	19.45	175000	2.5948	0.6145	0.5873
0.7965	20.01	180000	2.6013	0.6136	0.5859
0.7911	20.56	185000	2.6488	0.6106	0.5906
0.7873	21.12	190000	2.6141	0.6134	0.5810
0.7931	21.68	195000	2.6865	0.6010	0.5795
0.7866	22.23	200000	2.5861	0.6160	0.5810
0.7867	22.79	205000	2.5334	0.6224	0.5886
0.7841	23.34	210000	2.5656	0.6272	0.5909
0.7897	23.9	215000	2.4915	0.6307	0.5949
0.7857	24.46	220000	2.6083	0.6166	0.5886
0.7841	25.01	225000	2.5430	0.6262	0.5941
0.7842	25.57	230000	2.6212	0.6123	0.5852
0.7816	26.12	235000	2.6234	0.6127	0.5934
0.7818	26.68	240000	2.6039	0.6196	0.5945
0.7809	27.23	245000	2.6044	0.6180	0.5937
0.7822	27.79	250000	2.5414	0.6254	0.5931
0.7835	28.35	255000	2.5310	0.6263	0.5910
0.781	28.9	260000	2.5196	0.6291	0.5974
0.7777	29.46	265000	2.5659	0.6234	0.5935
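
The reported results come from the last row (step 265000), but the table shows that validation loss is lowest and accuracy highest at step 15000, while F1 peaks at step 260000. The sketch below shows one way to pick a row by metric. It uses a hand-picked subset of rows copied from the table, and the helper `best_row` is hypothetical. Whether intermediate checkpoints were saved is not stated in this card.

```python
# (step, validation_loss, accuracy, f1): a subset of rows copied from the table above
rows = [
    (5000, 2.3825, 0.6101, 0.5721),
    (15000, 2.2524, 0.6458, 0.5921),
    (215000, 2.4915, 0.6307, 0.5949),
    (260000, 2.5196, 0.6291, 0.5974),
    (265000, 2.5659, 0.6234, 0.5935),
]

def best_row(rows, column, higher_is_better=True):
    """Return the row with the best value at the given column index."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])

lowest_loss = best_row(rows, column=1, higher_is_better=False)
best_accuracy = best_row(rows, column=2)
best_f1 = best_row(rows, column=3)

print("lowest validation loss:", lowest_loss)
print("highest accuracy:", best_accuracy)
print("highest F1:", best_f1)
```

Selecting on F1 and selecting on validation loss give different steps here, so the choice of metric matters when comparing checkpoints.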

Framework versions

Transformers 4.33.3
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
