Image_Captioner

This model is a fine-tuned version of a base model that the card does not name, trained on an unknown dataset. After the final epoch (50) it achieves the following results on the evaluation set:

Loss: 0.0923
Rouge1: 25.0369
Rouge2: 10.1572
Rougel: 21.5244
Rougelsum: 24.0775
Gen Len: 18.9946
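
For readers unfamiliar with the ROUGE figures above, the following is a minimal sketch of a ROUGE-N F1 score on the same 0 to 100 scale. It is an illustration only: the function name `rouge_n_f1` and the whitespace tokenization are assumptions, and the card does not say which ROUGE implementation or preprocessing (stemming, tokenization) produced the reported numbers, so this sketch will not reproduce them exactly.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1, scaled to 0-100 (hypothetical helper)."""

    def ngrams(text: str) -> Counter:
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Made-up caption pair, not taken from the model's evaluation data.
score = rouge_n_f1("a dog runs on the grass", "a dog sits on grass")
print(round(score, 2))
```

With `n=2` the same function gives a bigram-overlap score in the spirit of Rouge2.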

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
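
As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and takes the 836 steps per epoch from the Step column of the training results table; the function name `linear_lr` is hypothetical.

```python
BASE_LR = 5e-05          # learning_rate from the card
STEPS_PER_EPOCH = 836    # from the Step column of the results table
NUM_EPOCHS = 50          # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 41800, the final step in the table


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Linear decay to zero, with optional linear warmup (assumed 0 here)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for epoch in (0, 13, 25, 50):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate falls to half its initial value at epoch 25 and reaches zero at step 41800. With a batch size of 8, and assuming a single device with no gradient accumulation, 836 steps per epoch corresponds to roughly 6,688 training examples.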

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.253	1.0	836	0.1372	29.3958	12.2981	25.5129	27.9289	19.0
0.1361	2.0	1672	0.1151	25.8361	12.2894	23.7346	25.47	19.0
0.115	3.0	2508	0.1037	25.1859	11.9032	23.1038	24.8338	19.0
0.1027	4.0	3344	0.0942	26.0345	12.0324	23.4843	25.5426	19.0
0.0873	5.0	4180	0.0864	26.1657	11.685	23.6563	25.6247	19.0
0.0742	6.0	5016	0.0794	24.3621	10.5113	21.7192	23.8253	19.0
0.0646	7.0	5852	0.0740	24.711	11.194	22.2089	24.1793	19.0
0.0542	8.0	6688	0.0690	25.0339	10.8651	22.171	24.4106	19.0
0.046	9.0	7524	0.0650	25.0982	11.8399	22.701	24.623	18.9987
0.0386	10.0	8360	0.0623	26.2563	10.4715	22.5319	25.1412	18.9987
0.0317	11.0	9196	0.0591	26.4001	11.8031	23.1653	25.2856	18.9919
0.0273	12.0	10032	0.0587	25.6521	11.0174	22.7327	24.9068	18.9879
0.0231	13.0	10868	0.0583	26.7035	11.2021	23.0121	25.6384	18.9946
0.0195	14.0	11704	0.0592	25.5747	10.7424	22.3673	24.6944	19.0
0.0167	15.0	12540	0.0608	25.3022	10.163	21.9556	24.3587	18.9596
0.0142	16.0	13376	0.0614	25.0496	10.0656	21.7629	24.1094	18.9206
0.0119	17.0	14212	0.0618	26.0112	10.2519	22.1926	24.8873	18.8735
0.0102	18.0	15048	0.0653	25.6183	10.04	22.1136	24.5255	18.9125
0.0086	19.0	15884	0.0671	24.7352	9.6328	21.0675	23.7704	18.8694
0.0076	20.0	16720	0.0693	24.9512	9.6635	21.4761	23.9132	18.9112
0.0067	21.0	17556	0.0708	24.1732	9.158	20.3408	23.029	18.8358
0.0058	22.0	18392	0.0732	24.4503	9.4394	20.8584	23.4242	18.8035
0.0048	23.0	19228	0.0738	24.8844	9.9125	21.3509	23.9336	18.8089
0.0043	24.0	20064	0.0777	25.5401	10.1857	21.8328	24.4294	18.9058
0.0038	25.0	20900	0.0781	24.2235	9.0445	20.4463	23.0001	18.9166
0.0033	26.0	21736	0.0801	25.0127	9.8025	21.3116	23.9683	18.7308
0.0029	27.0	22572	0.0807	24.5765	9.6283	20.9556	23.4559	18.9166
0.0027	28.0	23408	0.0830	24.8389	9.8899	21.4027	23.9416	18.9233
0.0024	29.0	24244	0.0833	25.3695	10.162	21.7865	24.3737	18.7106
0.0022	30.0	25080	0.0832	24.8804	10.0825	21.4621	24.0326	18.9287
0.0021	31.0	25916	0.0853	25.0049	9.7036	21.3664	23.9173	18.9044
0.0019	32.0	26752	0.0855	25.0529	9.4994	21.2781	24.0076	18.9125
0.002	33.0	27588	0.0852	24.8417	9.9376	21.2526	23.8552	18.9031
0.0015	34.0	28424	0.0857	24.6359	9.5179	20.8941	23.4553	18.8937
0.0014	35.0	29260	0.0858	25.1156	10.1869	21.5805	23.9664	18.8156
0.0013	36.0	30096	0.0871	24.739	9.5548	21.15	23.749	18.9219
0.0011	37.0	30932	0.0884	24.774	9.7848	21.2467	23.833	18.9556
0.0011	38.0	31768	0.0889	25.2656	9.9796	21.517	24.1836	18.9462
0.0011	39.0	32604	0.0895	24.6627	9.3783	20.9288	23.5835	18.9704
0.001	40.0	33440	0.0906	25.1326	9.814	21.3593	24.0816	18.9260
0.0009	41.0	34276	0.0900	25.6889	10.3712	22.0588	24.695	18.9731
0.0008	42.0	35112	0.0911	24.6819	9.8307	21.1335	23.7053	18.9071
0.0008	43.0	35948	0.0905	24.4835	9.7292	21.017	23.5027	18.9623
0.0007	44.0	36784	0.0910	24.8203	9.5875	21.245	23.7718	18.9825
0.0007	45.0	37620	0.0914	25.1212	10.1024	21.6215	24.1061	18.9771
0.0006	46.0	38456	0.0914	25.1636	9.8127	21.5343	24.13	18.9475
0.0006	47.0	39292	0.0915	24.866	9.8427	21.3531	23.8643	18.9394
0.0006	48.0	40128	0.0916	25.064	10.049	21.5198	24.1158	18.9731
0.0005	49.0	40964	0.0923	24.8424	9.9718	21.3263	23.9031	18.9933
0.0005	50.0	41800	0.0923	25.0369	10.1572	21.5244	24.0775	18.9946
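
One way to read the table above is to pick the checkpoint with the lowest validation loss. The sketch below does this on a handful of rows copied from the table (epoch, validation loss, Rouge1); the variable names are illustrative, and the card does not state which checkpoint was actually published.

```python
# (epoch, validation_loss, rouge1), copied from selected rows of the table.
rows = [
    (1, 0.1372, 29.3958),
    (5, 0.0864, 26.1657),
    (10, 0.0623, 26.2563),
    (13, 0.0583, 26.7035),
    (20, 0.0693, 24.9512),
    (30, 0.0832, 24.8804),
    (40, 0.0906, 25.1326),
    (50, 0.0923, 25.0369),
]

# Checkpoint selection by lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[1])
# Checkpoint selection by highest Rouge1, for comparison.
best_by_rouge1 = max(rows, key=lambda r: r[2])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest Rouge1: epoch", best_by_rouge1[0])
```

In the full table the validation loss bottoms out at 0.0583 (epoch 13) and then rises to 0.0923 by epoch 50 while the training loss keeps falling, a pattern consistent with overfitting. The two criteria disagree here: Rouge1 is highest after epoch 1.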

Framework versions

Transformers 4.37.1
Pytorch 1.13.1+cu117
Datasets 2.15.0
Tokenizers 0.15.1
