roberta-large-sst-2-32-13

This model is a fine-tuned version of roberta-large. The card does not record the dataset it was trained on. At the end of training (epoch 150, step 300) it achieves the following results on the evaluation set:

Loss: 0.4497
Accuracy: 0.9375

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
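
Although the card does not describe the data, the figures logged elsewhere in this card put rough bounds on its size. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that a final partial batch is kept; the card confirms none of these. The function names are illustrative only.

```python
TRAIN_BATCH_SIZE = 32   # train_batch_size from the hyperparameters in this card
STEPS_PER_EPOCH = 2     # training log: step 2 at epoch 1.0, step 300 at epoch 150.0


def train_size_bounds(batch_size, steps_per_epoch):
    """Range of dataset sizes N for which ceil(N / batch_size) == steps_per_epoch."""
    low = batch_size * (steps_per_epoch - 1) + 1
    high = batch_size * steps_per_epoch
    return low, high


def smallest_eval_size(accuracies, max_size=1024, tol=5e-5):
    """Smallest n such that every logged accuracy matches some k/n to 4 decimals."""
    for n in range(1, max_size + 1):
        if all(abs(round(a * n) / n - a) <= tol + 1e-9 for a in accuracies):
            return n
    return None


# A few distinct accuracy values copied from the training results table.
logged = [0.5, 0.5156, 0.5469, 0.9375, 0.9531, 0.9688]

print(train_size_bounds(TRAIN_BATCH_SIZE, STEPS_PER_EPOCH))  # (33, 64)
print(smallest_eval_size(logged))                            # 64
```

Under those assumptions, the training set holds between 33 and 64 examples, and the logged accuracies are consistent with an evaluation set of 64 examples (or a multiple of 64).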

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 150
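
The warmup setting interacts with the run length: the training log ends at step 300, which is below the 500 warmup steps. The sketch below shows what that implies, assuming the scheduler follows the usual linear-warmup-then-linear-decay rule (the form used by `get_linear_schedule_with_warmup` in Transformers). The card does not show the scheduler code, and `linear_schedule_lr` is a hypothetical helper written for this illustration.

```python
BASE_LR = 1e-05      # learning_rate from this card
WARMUP_STEPS = 500   # lr_scheduler_warmup_steps from this card
TOTAL_STEPS = 300    # 150 epochs x 2 steps per epoch, per the training log


def linear_schedule_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < warmup:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup)
    # Decay phase: scale linearly from base_lr down to 0 at `total`.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for step in (0, 100, 200, 300):
    print(step, linear_schedule_lr(step))
```

If that assumption holds, the run never leaves the warmup phase: the learning rate rises for all 300 steps and ends at about 6e-06, never reaching the configured 1e-05.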

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	2	0.6944	0.5
No log	2.0	4	0.6944	0.5
No log	3.0	6	0.6944	0.5
No log	4.0	8	0.6944	0.5
0.7018	5.0	10	0.6944	0.5
0.7018	6.0	12	0.6943	0.5
0.7018	7.0	14	0.6943	0.5
0.7018	8.0	16	0.6942	0.5
0.7018	9.0	18	0.6941	0.5
0.7003	10.0	20	0.6940	0.5
0.7003	11.0	22	0.6939	0.5
0.7003	12.0	24	0.6938	0.5
0.7003	13.0	26	0.6937	0.5
0.7003	14.0	28	0.6936	0.5
0.6964	15.0	30	0.6934	0.5
0.6964	16.0	32	0.6934	0.5
0.6964	17.0	34	0.6933	0.5
0.6964	18.0	36	0.6932	0.5
0.6964	19.0	38	0.6931	0.5
0.7001	20.0	40	0.6931	0.5
0.7001	21.0	42	0.6931	0.5
0.7001	22.0	44	0.6931	0.5
0.7001	23.0	46	0.6931	0.5
0.7001	24.0	48	0.6931	0.5
0.6924	25.0	50	0.6931	0.5
0.6924	26.0	52	0.6931	0.5
0.6924	27.0	54	0.6931	0.5
0.6924	28.0	56	0.6930	0.5
0.6924	29.0	58	0.6930	0.5
0.6985	30.0	60	0.6930	0.5
0.6985	31.0	62	0.6930	0.5
0.6985	32.0	64	0.6929	0.5
0.6985	33.0	66	0.6927	0.5
0.6985	34.0	68	0.6925	0.5
0.6968	35.0	70	0.6924	0.5
0.6968	36.0	72	0.6923	0.5
0.6968	37.0	74	0.6922	0.5
0.6968	38.0	76	0.6922	0.5
0.6968	39.0	78	0.6920	0.5
0.6822	40.0	80	0.6917	0.5
0.6822	41.0	82	0.6916	0.5
0.6822	42.0	84	0.6913	0.5
0.6822	43.0	86	0.6911	0.5
0.6822	44.0	88	0.6910	0.5
0.6907	45.0	90	0.6908	0.5
0.6907	46.0	92	0.6906	0.5
0.6907	47.0	94	0.6905	0.5
0.6907	48.0	96	0.6902	0.5156
0.6907	49.0	98	0.6898	0.5625
0.6822	50.0	100	0.6892	0.5469
0.6822	51.0	102	0.6887	0.5938
0.6822	52.0	104	0.6881	0.5938
0.6822	53.0	106	0.6874	0.6094
0.6822	54.0	108	0.6868	0.6094
0.6744	55.0	110	0.6862	0.5938
0.6744	56.0	112	0.6859	0.5312
0.6744	57.0	114	0.6856	0.5469
0.6744	58.0	116	0.6873	0.5469
0.6744	59.0	118	0.6910	0.5469
0.6401	60.0	120	0.6938	0.5469
0.6401	61.0	122	0.6911	0.5625
0.6401	62.0	124	0.6835	0.5625
0.6401	63.0	126	0.6765	0.5781
0.6401	64.0	128	0.6689	0.5781
0.5823	65.0	130	0.6597	0.6094
0.5823	66.0	132	0.6514	0.625
0.5823	67.0	134	0.6459	0.6406
0.5823	68.0	136	0.6372	0.6562
0.5823	69.0	138	0.6274	0.6562
0.5265	70.0	140	0.6163	0.6875
0.5265	71.0	142	0.6018	0.7188
0.5265	72.0	144	0.5853	0.7812
0.5265	73.0	146	0.5600	0.7812
0.5265	74.0	148	0.5138	0.8125
0.4305	75.0	150	0.4514	0.8594
0.4305	76.0	152	0.3753	0.9219
0.4305	77.0	154	0.3197	0.9375
0.4305	78.0	156	0.2687	0.9375
0.4305	79.0	158	0.2246	0.9531
0.2335	80.0	160	0.2019	0.9219
0.2335	81.0	162	0.1977	0.9219
0.2335	82.0	164	0.1741	0.9375
0.2335	83.0	166	0.1468	0.9375
0.2335	84.0	168	0.1355	0.9688
0.0918	85.0	170	0.1447	0.9688
0.0918	86.0	172	0.1628	0.9688
0.0918	87.0	174	0.2077	0.9531
0.0918	88.0	176	0.2623	0.9375
0.0918	89.0	178	0.2854	0.9375
0.0132	90.0	180	0.3076	0.9375
0.0132	91.0	182	0.2989	0.9375
0.0132	92.0	184	0.2839	0.9531
0.0132	93.0	186	0.2756	0.9531
0.0132	94.0	188	0.2669	0.9531
0.0035	95.0	190	0.2414	0.9531
0.0035	96.0	192	0.2353	0.9375
0.0035	97.0	194	0.2482	0.9531
0.0035	98.0	196	0.2578	0.9375
0.0035	99.0	198	0.2755	0.9375
0.0013	100.0	200	0.2956	0.9375
0.0013	101.0	202	0.3133	0.9531
0.0013	102.0	204	0.3293	0.9531
0.0013	103.0	206	0.3417	0.9531
0.0013	104.0	208	0.3510	0.9531
0.0005	105.0	210	0.3616	0.9531
0.0005	106.0	212	0.3694	0.9531
0.0005	107.0	214	0.3754	0.9531
0.0005	108.0	216	0.3806	0.9531
0.0005	109.0	218	0.3850	0.9531
0.0004	110.0	220	0.3890	0.9531
0.0004	111.0	222	0.3924	0.9531
0.0004	112.0	224	0.3956	0.9531
0.0004	113.0	226	0.3986	0.9531
0.0004	114.0	228	0.4011	0.9531
0.0003	115.0	230	0.4034	0.9531
0.0003	116.0	232	0.4056	0.9531
0.0003	117.0	234	0.4076	0.9531
0.0003	118.0	236	0.4118	0.9531
0.0003	119.0	238	0.4199	0.9531
0.0003	120.0	240	0.4298	0.9375
0.0003	121.0	242	0.4401	0.9375
0.0003	122.0	244	0.4495	0.9375
0.0003	123.0	246	0.4602	0.9375
0.0003	124.0	248	0.4687	0.9375
0.0003	125.0	250	0.4755	0.9375
0.0003	126.0	252	0.4813	0.9375
0.0003	127.0	254	0.4855	0.9375
0.0003	128.0	256	0.4896	0.9375
0.0003	129.0	258	0.4940	0.9375
0.0002	130.0	260	0.4967	0.9375
0.0002	131.0	262	0.4963	0.9375
0.0002	132.0	264	0.4903	0.9375
0.0002	133.0	266	0.4861	0.9375
0.0002	134.0	268	0.4831	0.9375
0.0003	135.0	270	0.4804	0.9375
0.0003	136.0	272	0.4780	0.9375
0.0003	137.0	274	0.4761	0.9375
0.0003	138.0	276	0.4721	0.9375
0.0003	139.0	278	0.4686	0.9375
0.0002	140.0	280	0.4646	0.9375
0.0002	141.0	282	0.4593	0.9375
0.0002	142.0	284	0.4542	0.9375
0.0002	143.0	286	0.4495	0.9375
0.0002	144.0	288	0.4472	0.9375
0.0002	145.0	290	0.4465	0.9375
0.0002	146.0	292	0.4467	0.9375
0.0002	147.0	294	0.4469	0.9375
0.0002	148.0	296	0.4474	0.9375
0.0002	149.0	298	0.4483	0.9375
0.0002	150.0	300	0.4497	0.9375
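
The table can be scanned for the epoch with the lowest validation loss. The sketch below does so on a handful of rows copied from the table; the column names and the `parse_rows` helper are illustrative, not part of the Trainer output.

```python
# Rows copied from the training results table:
# training loss, epoch, step, validation loss, accuracy.
ROWS = """
0.4305 79.0 158 0.2246 0.9531
0.2335 83.0 166 0.1468 0.9375
0.2335 84.0 168 0.1355 0.9688
0.0918 85.0 170 0.1447 0.9688
0.0132 90.0 180 0.3076 0.9375
0.0002 150.0 300 0.4497 0.9375
"""


def parse_rows(text):
    """Turn whitespace-separated log rows into a list of dicts."""
    records = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, acc = line.split()
        records.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "accuracy": float(acc),
        })
    return records


records = parse_rows(ROWS)
best = min(records, key=lambda r: r["val_loss"])
final = records[-1]
print("best epoch:", best["epoch"], "val_loss:", best["val_loss"])
print("final epoch:", final["epoch"], "val_loss:", final["val_loss"])
```

In the full table, validation loss bottoms out at 0.1355 (epoch 84, accuracy 0.9688) and rises to 0.4497 by epoch 150 while training loss falls to 0.0002, a pattern typical of overfitting. The reported evaluation results match the final row, so an earlier checkpoint might generalise better.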

Framework versions

Transformers 4.32.0.dev0
PyTorch 2.0.1+cu118
Datasets 2.4.0
Tokenizers 0.13.3

simonycl/roberta-large-sst-2-32-13
