flan-t5-small-query-expansion-merged-lr-2e-4-ep-30

This model is a fine-tuned version of google/flan-t5-small. The training dataset is not specified (the auto-generated card records it as "None"). It achieves the following results on the evaluation set after the final epoch (epoch 30):

Loss: 0.0687
Rouge1: 88.0058
Rouge2: 86.0177
Rougel: 87.4622
Rougelsum: 87.8743
Gen Len: 18.3001

Model description

More information needed

Intended uses & limitations

More information needed
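
Until usage details are documented, the following is a minimal sketch of loading the checkpoint for inference with the Transformers seq2seq classes. It assumes the checkpoint is published under the repository id shown on this page, that the model accepts a raw query string with no task prefix, and that 32 new tokens is a sufficient budget. None of these is confirmed by the card, and `expand_query` is a hypothetical helper name.

```python
# Assumption: repository id taken from this page; availability is not verified here.
MODEL_ID = "lapp0/flan-t5-small-query-expansion-merged-lr-2e-4-ep-30"


def expand_query(query: str, max_new_tokens: int = 32) -> str:
    """Generate one output string for a query.

    Assumption: the model takes the raw query as input, with no prompt prefix.
    The card does not document the expected input format.
    """
    # Imported lazily so this module can be imported without loading the model.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(query, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `expand_query("best hiking trails near seattle")` downloads the checkpoint on first use and returns the decoded generation. For repeated calls, loading the tokenizer and model once outside the function would avoid reloading them each time.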

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.05
num_epochs: 30
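
To make the schedule concrete, here is a minimal sketch of the learning rate implied by these settings: linear warmup followed by half-cosine decay. It is a standalone re-implementation for illustration, not the Trainer's own code. The step count per epoch (3377) comes from the training results table, and rounding the warmup length up to a whole step is an assumption.

```python
import math

BASE_LR = 2e-4          # learning_rate
WARMUP_RATIO = 0.05     # lr_scheduler_warmup_ratio
STEPS_PER_EPOCH = 3377  # step count after epoch 1 in the results table
NUM_EPOCHS = 30         # num_epochs

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 101310, matches the final table row
# Assumption: the warmup length is rounded up to a whole number of steps.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the learning rate at the end of a few epochs.
    for epoch in (1, 2, 10, 20, 30):
        print(epoch, lr_at(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate peaks at 2e-4 partway through the second epoch and decays to zero by the last step, which is consistent with the evaluation metrics barely moving over the final epochs.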

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.5903	1.0	3377	0.6694	64.0967	45.4218	57.4907	61.0826	18.3315
0.6513	2.0	6754	0.5487	66.0187	48.1106	59.9204	63.1815	18.2596
0.8114	3.0	10131	0.4678	68.4505	52.0787	63.0805	66.1556	18.2296
0.4854	4.0	13508	0.3981	69.9674	54.7741	64.9602	67.9319	18.2352
0.5574	5.0	16885	0.3512	71.8602	57.3778	67.222	69.9169	18.1424
0.5343	6.0	20262	0.3047	72.9426	59.4383	68.6726	71.1382	18.0677
0.5003	7.0	23639	0.2670	74.6434	62.3906	71.017	73.1537	18.2826
0.4381	8.0	27016	0.2366	75.5879	63.5581	71.9563	74.0976	18.2247
0.4298	9.0	30393	0.2065	77.1535	66.3128	74.04	75.8557	18.1933
0.3524	10.0	33770	0.1877	78.2066	68.6445	75.2292	77.1067	18.2107
0.3374	11.0	37147	0.1650	79.2401	70.0953	76.392	78.1788	18.1982
0.2578	12.0	40524	0.1424	80.3072	72.4561	78.1583	79.5657	18.2659
0.2468	13.0	43901	0.1280	81.8419	75.2485	80.0524	81.2131	18.2401
0.2079	14.0	47278	0.1147	82.7505	76.7955	81.1375	82.1929	18.2212
0.1632	15.0	50655	0.1023	83.7303	78.3819	82.1019	83.1492	18.2826
0.1669	16.0	54032	0.0945	84.5118	79.8875	83.2797	84.1232	18.2561
0.1974	17.0	57409	0.0886	85.5067	81.4914	84.3091	85.178	18.2840
0.1461	18.0	60786	0.0829	85.9375	82.3743	85.025	85.6625	18.2805
0.1262	19.0	64163	0.0797	86.3679	83.1603	85.507	86.0875	18.2722
0.0982	20.0	67540	0.0759	87.215	84.5141	86.4955	86.9934	18.2770
0.087	21.0	70917	0.0726	87.2046	84.548	86.4369	86.9678	18.2924
0.0914	22.0	74294	0.0715	87.7024	85.3997	86.9993	87.4716	18.2882
0.0945	23.0	77671	0.0703	87.8468	85.7513	87.2094	87.6558	18.2896
0.0586	24.0	81048	0.0698	87.883	85.8184	87.3243	87.692	18.2882
0.062	25.0	84425	0.0689	87.9345	85.9142	87.3693	87.7799	18.2875
0.0758	26.0	87802	0.0687	87.9042	85.8727	87.3166	87.7249	18.2903
0.0771	27.0	91179	0.0686	87.989	86.0401	87.4768	87.8379	18.2882
0.0744	28.0	94556	0.0687	88.0227	86.0604	87.4917	87.8992	18.3001
0.0419	29.0	97933	0.0687	88.0058	86.0177	87.4622	87.8743	18.3001
0.0615	30.0	101310	0.0687	88.0058	86.0177	87.4622	87.8743	18.3001
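
The metrics plateau over the last few epochs, and the reported final checkpoint is not necessarily the best row on every metric. The sketch below parses the last five rows of the table above (copied verbatim, whitespace-separated) and picks the epoch with the lowest validation loss and the highest Rouge1. `EvalRow` and `parse_rows` are hypothetical helper names introduced for illustration.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    val_loss: float
    rouge1: float


# Last five rows of the training results table. Column order:
# train loss, epoch, step, val loss, Rouge1, Rouge2, Rougel, Rougelsum, gen len
LAST_ROWS = """
0.0758 26.0 87802 0.0687 87.9042 85.8727 87.3166 87.7249 18.2903
0.0771 27.0 91179 0.0686 87.989 86.0401 87.4768 87.8379 18.2882
0.0744 28.0 94556 0.0687 88.0227 86.0604 87.4917 87.8992 18.3001
0.0419 29.0 97933 0.0687 88.0058 86.0177 87.4622 87.8743 18.3001
0.0615 30.0 101310 0.0687 88.0058 86.0177 87.4622 87.8743 18.3001
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated table rows into (epoch, val_loss, rouge1)."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        rows.append(
            EvalRow(int(float(fields[1])), float(fields[3]), float(fields[4]))
        )
    return rows


rows = parse_rows(LAST_ROWS)
best_loss = min(rows, key=lambda r: r.val_loss)
best_rouge1 = max(rows, key=lambda r: r.rouge1)
print("lowest validation loss:", best_loss)
print("highest Rouge1:", best_rouge1)
```

The gaps between these rows and the final epoch are very small (0.0001 in loss, under 0.02 in Rouge1), so they may well be within evaluation noise.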

Framework versions

Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.18.0
Tokenizers 0.15.2

Model repository: lapp0/flan-t5-small-query-expansion-merged-lr-2e-4-ep-30