lapp0/flan-t5-small-query-expansion-merged

	---
	license: apache-2.0
	base_model: google/flan-t5-small
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: flan-t5-small-query-expansion-merged
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# flan-t5-small-query-expansion-merged

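	This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small). The training dataset was not recorded (the auto-generated card lists it as `None`).
	After the final epoch (16, step 54032) it achieves the following results on the evaluation set: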
	- Loss: 0.0729
	- Rouge1: 88.0902
	- Rouge2: 86.3492
	- Rougel: 87.7337
	- Rougelsum: 87.9824
	- Gen Len: 18.3077
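
For readers unfamiliar with the ROUGE figures above, the following is a minimal sketch of what ROUGE-N measures: the F1 score of clipped n-gram overlap between a generated string and a reference, scaled to 0-100 as the card's values appear to be. It is a simplified illustration, not the evaluation code used for this model: it assumes lowercased whitespace tokenization with no stemming, and `rouge_n_f1` is a hypothetical helper name.

```python
from collections import Counter


def rouge_n_f1(prediction: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 on lowercased whitespace tokens, scaled to 0-100."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction), ngrams(reference)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)
```

With `n=1` this corresponds to Rouge1 and with `n=2` to Rouge2; Rougel and Rougelsum are based on the longest common subsequence instead and are not covered by this sketch.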

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
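
The card does not document how to call the model, so the following is only a sketch of loading it as a standard T5-style sequence-to-sequence checkpoint with the `transformers` Auto classes. Several things are assumptions: that the repository ships its own tokenizer files, that the raw query is passed without any prompt prefix (the expected input format is not stated anywhere in the card), and the `max_new_tokens` default. `expand_query` is a hypothetical helper name.

```python
MODEL_ID = "lapp0/flan-t5-small-query-expansion-merged"


def expand_query(query: str, max_new_tokens: int = 32) -> str:
    """Generate text for `query` with the fine-tuned checkpoint.

    Assumption: the model takes the raw query as input; the card does not
    specify a prompt template.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(query, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `expand_query("hiking boots")` downloads the weights on first use. For repeated calls it would be cheaper to load the tokenizer and model once and reuse them.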

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.001
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.05
	- num_epochs: 16
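
To make the scheduler settings above concrete, here is a sketch of the learning rate they imply at a given optimizer step: linear warmup over the first 5% of steps, then cosine decay to zero. The total step count (54032) is taken from the last row of the training results table; rounding the warmup length up and using a single half-cosine cycle are assumptions about the scheduler's defaults, and `lr_at` is a hypothetical helper name.

```python
import math

BASE_LR = 0.001        # learning_rate from the card
WARMUP_RATIO = 0.05    # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 54032    # final step in the results table (16 epochs x 3377 steps)
# Assumption: the warmup length is the ratio times total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at `step` under linear warmup followed by cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half a cosine cycle: 1.0 at the end of warmup, 0.0 at the final step.
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))
```

Under these assumptions the rate peaks at 0.001 once warmup ends and is half of that at the midpoint of the decay phase.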

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
	|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
	| 0.6406        | 1.0   | 3377  | 0.6768          | 63.9968 | 45.7612 | 57.8086 | 61.5311   | 18.3873 |
	| 0.7793        | 2.0   | 6754  | 0.5605          | 67.0163 | 49.7255 | 61.2364 | 64.6925   | 18.3231 |
	| 1.0244        | 3.0   | 10131 | 0.4842          | 67.8219 | 51.5119 | 62.3029 | 65.6804   | 18.2080 |
	| 0.5659        | 4.0   | 13508 | 0.4397          | 69.1529 | 53.8002 | 64.4153 | 67.2391   | 18.3712 |
	| 0.7296        | 5.0   | 16885 | 0.3969          | 70.5914 | 56.0644 | 66.0627 | 68.576    | 18.1605 |
	| 0.7259        | 6.0   | 20262 | 0.3626          | 70.8523 | 56.4451 | 66.252  | 69.1099   | 18.3231 |
	| 0.6528        | 7.0   | 23639 | 0.3237          | 73.073  | 59.6605 | 68.7564 | 71.3906   | 18.2966 |
	| 0.5374        | 8.0   | 27016 | 0.2677          | 74.5797 | 62.7906 | 70.8802 | 73.0946   | 18.2812 |
	| 0.3949        | 9.0   | 30393 | 0.2195          | 77.0612 | 66.8027 | 73.9263 | 75.8907   | 18.2763 |
	| 0.3018        | 10.0  | 33770 | 0.1636          | 79.9678 | 71.998  | 77.5129 | 78.9566   | 18.2394 |
	| 0.2242        | 11.0  | 37147 | 0.1276          | 82.9401 | 77.1969 | 81.2458 | 82.3421   | 18.2924 |
	| 0.1141        | 12.0  | 40524 | 0.0940          | 85.6963 | 81.8712 | 84.6628 | 85.3014   | 18.3105 |
	| 0.087         | 13.0  | 43901 | 0.0816          | 86.9817 | 84.3464 | 86.2565 | 86.7104   | 18.3070 |
	| 0.0375        | 14.0  | 47278 | 0.0739          | 87.9019 | 85.9691 | 87.4218 | 87.7412   | 18.3022 |
	| 0.0356        | 15.0  | 50655 | 0.0726          | 88.0522 | 86.2944 | 87.6779 | 87.9371   | 18.3015 |
	| 0.0302        | 16.0  | 54032 | 0.0729          | 88.0902 | 86.3492 | 87.7337 | 87.9824   | 18.3077 |
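
Although the training set is not documented, the step counts in the table bound its size. The arithmetic below is a sketch under stated assumptions: a single device, no gradient accumulation, and the last partial batch of each epoch kept. None of these is confirmed by the card.

```python
STEPS_PER_EPOCH = 3377   # step count at epoch 1.0 in the table
TRAIN_BATCH_SIZE = 8     # train_batch_size from the card
NUM_EPOCHS = 16          # num_epochs from the card

# Every batch full: upper bound on the number of training examples.
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

# Last batch holding a single example: lower bound.
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1

# Total optimizer steps, which should match the final row of the table.
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

print(f"training examples: between {min_examples} and {max_examples}")
print(f"total steps: {total_steps}")
```

If the run used several devices or gradient accumulation, the example count would scale up by the corresponding factor.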


	### Framework versions

	- Transformers 4.38.2
	- PyTorch 2.2.0+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2