---
base_model: /exports/eddie/scratch/s1970716/models/summarization/longt5_xl_summ_screen_bp_only/checkpoint-210
tags:
- generated_from_trainer
datasets:
- learn3r/summ_screen_fd_bp
metrics:
- rouge
model-index:
- name: longt5_xl_summ_screen_bp_only_30
  results:
  - task:
      name: Summarization
      type: summarization
    dataset:
      name: learn3r/summ_screen_fd_bp
      type: learn3r/summ_screen_fd_bp
    metrics:
    - name: Rouge1
      type: rouge
      value: 40.4388
---
|
|
|
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# longt5_xl_summ_screen_bp_only_30

This model is a fine-tuned version of `/exports/eddie/scratch/s1970716/models/summarization/longt5_xl_summ_screen_bp_only/checkpoint-210` (a local checkpoint, not hosted on the Hugging Face Hub) on the learn3r/summ_screen_fd_bp dataset.
|
It achieves the following results on the evaluation set:
- Loss: 2.2376
- Rouge1: 40.4388
- Rouge2: 16.4662
- Rougel: 28.0771
- Rougelsum: 38.3405
- Gen Len: 246.7396
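For orientation, ROUGE-1 measures unigram overlap between a generated summary and its reference. A minimal, illustrative sketch of the ROUGE-1 F1 computation (the scores above come from the standard `rouge` metric implementation, which additionally applies stemming and other preprocessing this sketch omits):

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap ROUGE-1 F1 between two texts (whitespace tokenised)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each candidate unigram counts at most as often
    # as it appears in the reference.
    overlap = sum(min(count, ref_counts[word]) for word, count in cand_counts.items())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```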
|
|
|
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
|
|
|
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 15.0
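The reported total_train_batch_size is the per-device batch size multiplied by the gradient-accumulation steps (assuming a single device, which these numbers imply):

```python
# Effective (total) train batch size implied by the hyperparameters above.
train_batch_size = 8
gradient_accumulation_steps = 32

total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 256
```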
|
|
|
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 0.324         | 0.97  | 14   | 2.2376          | 40.4388 | 16.4662 | 28.0771 | 38.3405   | 246.7396 |
| 0.2707        | 1.95  | 28   | 2.3204          | 40.2873 | 16.7641 | 27.3895 | 38.2689   | 307.3787 |
| 0.2217        | 2.99  | 43   | 2.5281          | 31.9916 | 13.8136 | 22.1895 | 30.623    | 501.9320 |
| 0.1776        | 3.97  | 57   | 2.7530          | 31.7535 | 13.8852 | 22.8653 | 30.3796   | 489.6183 |
| 0.1424        | 4.94  | 71   | 2.6578          | 32.117  | 14.2141 | 22.3733 | 30.8328   | 502.1124 |
| 0.1449        | 5.98  | 86   | 2.5508          | 35.3448 | 13.8478 | 24.9044 | 33.6108   | 357.3136 |
| 0.1191        | 6.96  | 100  | 3.1622          | 37.2189 | 16.0076 | 25.7011 | 35.294    | 408.8669 |
| 0.0879        | 8.0   | 115  | 2.8510          | 39.8825 | 16.8073 | 27.2428 | 37.9568   | 318.2278 |
| 0.0899        | 8.97  | 129  | 2.9138          | 31.7139 | 13.7066 | 21.8844 | 30.5075   | 500.4053 |
| 0.0656        | 9.95  | 143  | 3.1616          | 33.055  | 14.5841 | 22.5883 | 31.7565   | 488.1686 |
| 0.0542        | 10.99 | 158  | 3.3630          | 43.7514 | 18.9011 | 29.9017 | 41.6887   | 198.8077 |
| 0.0557        | 11.97 | 172  | 3.3826          | 42.3089 | 18.2735 | 29.0356 | 40.4154   | 270.9675 |
| 0.0542        | 12.94 | 186  | 3.4408          | 40.7691 | 16.529  | 28.3999 | 38.9723   | 186.7308 |
| 0.0596        | 13.98 | 201  | 3.5253          | 37.0037 | 15.9098 | 25.2808 | 35.3868   | 398.4704 |
| 0.0385        | 14.61 | 210  | 3.4990          | 32.5815 | 14.2951 | 22.4501 | 31.2928   | 499.3107 |
|
|
|
|
|
### Framework versions

- Transformers 4.34.0.dev0
- Pytorch 2.0.1+cu117
- Datasets 2.14.5
- Tokenizers 0.13.3
|
|