scenario-NON-KD-PR-COPY-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_

This model is a fine-tuned version of microsoft/mdeberta-v3-base on the tweet_sentiment_multilingual dataset. On the evaluation set, the final checkpoint (step 23000, epoch 50) achieves the following results:

Loss: 4.7108
Accuracy: 0.5475
F1: 0.5469
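
For readers who want to try the checkpoint, here is a minimal loading sketch. It assumes the repository id shown on the model page, the standard `transformers` text-classification pipeline, and network access for the download; the helper names `build_classifier` and `classify` are hypothetical. The card does not document the label names, so the output may show generic labels such as `LABEL_0`, and the mDeBERTa tokenizer may need `sentencepiece` installed.

```python
# Repository id as shown on the model page.
MODEL_ID = (
    "haryoaw/scenario-NON-KD-PR-COPY-CDF-ALL-D2_data-"
    "cardiffnlp_tweet_sentiment_multilingual_"
)


def build_classifier():
    """Create a text-classification pipeline for this checkpoint.

    transformers is imported lazily so this file loads without it.
    Calling this function downloads the weights and needs network access.
    """
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def classify(texts, classifier=None):
    """Return one {'label': ..., 'score': ...} dict per input text."""
    classifier = classifier or build_classifier()
    return classifier(list(texts))
```

A call such as `classify(["I love this!", "No me gusta nada."])` should return one prediction per tweet. With evaluation accuracy near 0.55, treat the predictions as rough.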

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 66
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
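
The linear scheduler and the step counts in the results table imply a simple decay curve and a rough training-set size. The sketch below is a plain-Python illustration, not the training script. It assumes zero warmup steps (the card lists none), a single device, and no gradient accumulation; `linear_lr`, `WARMUP_STEPS`, and the other names are made up for the example.

```python
# Values copied from the hyperparameter list and the last row of the results table.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 32
NUM_EPOCHS = 50
TOTAL_STEPS = 23000  # final "Step" value in the training results

# Assumption: no warmup, since the card lists no warmup setting.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS
# Upper bound on the training-set size: the last batch of an epoch may be partial.
max_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print("steps per epoch:", steps_per_epoch)
print("training examples (upper bound):", max_train_examples)
for step in (0, 11500, 23000):
    print(step, linear_lr(step))
```

Under these assumptions one epoch is 460 steps, which agrees with the table's first row (500 steps is about 1.087 epochs), and the learning rate falls to half its initial value at step 11500 and to zero at step 23000.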

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
1.0791	1.0870	500	1.0368	0.4660	0.4346
0.9866	2.1739	1000	1.0697	0.5116	0.4871
0.891	3.2609	1500	1.0182	0.5351	0.5315
0.7817	4.3478	2000	1.0793	0.5382	0.5268
0.6754	5.4348	2500	1.2254	0.5440	0.5347
0.5735	6.5217	3000	1.3490	0.5490	0.5419
0.4804	7.6087	3500	1.4240	0.5374	0.5365
0.402	8.6957	4000	1.6744	0.5409	0.5346
0.3338	9.7826	4500	1.8045	0.5293	0.5303
0.2826	10.8696	5000	1.8731	0.5340	0.5340
0.239	11.9565	5500	2.0811	0.5336	0.5331
0.1932	13.0435	6000	2.5003	0.5374	0.5358
0.172	14.1304	6500	2.3698	0.5374	0.5364
0.1508	15.2174	7000	2.9410	0.5502	0.5455
0.1364	16.3043	7500	2.9157	0.5463	0.5472
0.1274	17.3913	8000	2.8807	0.5417	0.5329
0.1187	18.4783	8500	3.1515	0.5332	0.5328
0.1021	19.5652	9000	3.1270	0.5355	0.5363
0.0998	20.6522	9500	3.1811	0.5521	0.5512
0.0919	21.7391	10000	3.0586	0.5409	0.5327
0.0904	22.8261	10500	3.1029	0.5382	0.5382
0.0785	23.9130	11000	3.3520	0.5405	0.5389
0.0695	25.0	11500	2.8631	0.5475	0.5443
0.0683	26.0870	12000	3.3984	0.5467	0.5462
0.0634	27.1739	12500	3.3375	0.5521	0.5520
0.0554	28.2609	13000	3.5643	0.5444	0.5442
0.0521	29.3478	13500	3.6555	0.5386	0.5373
0.0468	30.4348	14000	3.8146	0.5498	0.5477
0.0475	31.5217	14500	3.9862	0.5370	0.5374
0.0423	32.6087	15000	3.9440	0.5413	0.5377
0.0417	33.6957	15500	3.9646	0.5405	0.5409
0.0408	34.7826	16000	3.8754	0.5424	0.5433
0.0314	35.8696	16500	4.2460	0.5413	0.5394
0.0344	36.9565	17000	4.2120	0.5455	0.5444
0.0296	38.0435	17500	4.4753	0.5448	0.5451
0.0308	39.1304	18000	4.1944	0.5494	0.5491
0.0225	40.2174	18500	4.4062	0.5486	0.5466
0.0284	41.3043	19000	4.1900	0.5444	0.5428
0.0191	42.3913	19500	4.5725	0.5444	0.5441
0.0202	43.4783	20000	4.5546	0.5502	0.5492
0.019	44.5652	20500	4.6947	0.5463	0.5465
0.02	45.6522	21000	4.4766	0.5471	0.5462
0.0182	46.7391	21500	4.4498	0.5490	0.5480
0.0131	47.8261	22000	4.5762	0.5490	0.5484
0.0105	48.9130	22500	4.7128	0.5467	0.5464
0.015	50.0	23000	4.7108	0.5475	0.5469
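
The table shows training loss falling steadily while validation loss climbs, so the final checkpoint is not necessarily the best one to keep. As a small illustration, the following sketch re-reads the step, validation loss, accuracy, and F1 columns copied from the table above and picks the rows with the lowest validation loss and the highest F1. The variable names are hypothetical and the selection rule is only one reasonable choice.

```python
# Columns copied from the training results table:
# step, validation loss, accuracy, F1 (training loss and epoch omitted).
ROWS = """
500 1.0368 0.4660 0.4346
1000 1.0697 0.5116 0.4871
1500 1.0182 0.5351 0.5315
2000 1.0793 0.5382 0.5268
2500 1.2254 0.5440 0.5347
3000 1.3490 0.5490 0.5419
3500 1.4240 0.5374 0.5365
4000 1.6744 0.5409 0.5346
4500 1.8045 0.5293 0.5303
5000 1.8731 0.5340 0.5340
5500 2.0811 0.5336 0.5331
6000 2.5003 0.5374 0.5358
6500 2.3698 0.5374 0.5364
7000 2.9410 0.5502 0.5455
7500 2.9157 0.5463 0.5472
8000 2.8807 0.5417 0.5329
8500 3.1515 0.5332 0.5328
9000 3.1270 0.5355 0.5363
9500 3.1811 0.5521 0.5512
10000 3.0586 0.5409 0.5327
10500 3.1029 0.5382 0.5382
11000 3.3520 0.5405 0.5389
11500 2.8631 0.5475 0.5443
12000 3.3984 0.5467 0.5462
12500 3.3375 0.5521 0.5520
13000 3.5643 0.5444 0.5442
13500 3.6555 0.5386 0.5373
14000 3.8146 0.5498 0.5477
14500 3.9862 0.5370 0.5374
15000 3.9440 0.5413 0.5377
15500 3.9646 0.5405 0.5409
16000 3.8754 0.5424 0.5433
16500 4.2460 0.5413 0.5394
17000 4.2120 0.5455 0.5444
17500 4.4753 0.5448 0.5451
18000 4.1944 0.5494 0.5491
18500 4.4062 0.5486 0.5466
19000 4.1900 0.5444 0.5428
19500 4.5725 0.5444 0.5441
20000 4.5546 0.5502 0.5492
20500 4.6947 0.5463 0.5465
21000 4.4766 0.5471 0.5462
21500 4.4498 0.5490 0.5480
22000 4.5762 0.5490 0.5484
22500 4.7128 0.5467 0.5464
23000 4.7108 0.5475 0.5469
"""

FIELDS = ("step", "val_loss", "accuracy", "f1")
records = [
    dict(zip(FIELDS, map(float, line.split())))
    for line in ROWS.strip().splitlines()
]

best_loss = min(records, key=lambda r: r["val_loss"])  # lowest validation loss
best_f1 = max(records, key=lambda r: r["f1"])          # highest F1
final = records[-1]                                    # last logged evaluation

for name, row in (("lowest val loss", best_loss), ("highest F1", best_f1), ("final", final)):
    print(f"{name}: step {int(row['step'])}, "
          f"val_loss {row['val_loss']:.4f}, acc {row['accuracy']:.4f}, f1 {row['f1']:.4f}")
```

On these numbers validation loss is lowest at step 1500 (1.0182) and F1 peaks at step 12500 (0.5520), while the final checkpoint reports an F1 of 0.5469 with a validation loss of 4.7108.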

Framework versions

Transformers 4.44.2
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.19.1

haryoaw/scenario-NON-KD-PR-COPY-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_
