zera09/bart-base-summarize-finetuned

Text2Text Generation

Generated from Trainer

	---
	license: apache-2.0
	base_model: facebook/bart-base
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: bart-base-summarize-finetuned
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# bart-base-summarize-finetuned

	This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on an unspecified dataset (the Trainer did not record a dataset name).
	It achieves the following results on the evaluation set at the final checkpoint (epoch 20, step 1240):
	- Loss: 0.3408
	- Rouge1: 79.6622
	- Rouge2: 77.9282
	- RougeL: 79.6654
	- RougeLsum: 79.6384
	- Gen Len: 7.8821
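
The card does not document how to run the model. The following is a minimal sketch, assuming the checkpoint is published under the repository id shown at the top of this page and loads through the generic `AutoTokenizer` / `AutoModelForSeq2SeqLM` classes. The helper name `summarize` and the `max_new_tokens=32` default are illustrative choices, not settings taken from the training run; the evaluation `Gen Len` of roughly 8 tokens only suggests that outputs are short.

```python
# Repository id as shown on the model page.
MODEL_ID = "zera09/bart-base-summarize-finetuned"


def summarize(text: str, max_new_tokens: int = 32) -> str:
    """Generate a short summary of `text` (hypothetical helper, not part of the card)."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate overly long inputs to the tokenizer's maximum length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# print(summarize("Your input text goes here."))
```

Because the training data and task format are not described, inputs that differ from whatever the model was fine-tuned on may produce poor summaries.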

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 20
	- mixed_precision_training: Native AMP
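
To make the `linear` scheduler entry concrete, here is a small sketch of the learning rate it implies over the run. It assumes zero warmup steps (none are listed above) and 1240 total optimizer steps, which is 20 epochs times the 62 steps per epoch reported in the training results. The function name `linear_lr` is illustrative.

```python
def linear_lr(step: int,
              base_lr: float = 2e-5,
              total_steps: int = 1240,
              warmup_steps: int = 0) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup (unused when warmup_steps == 0).
        return base_lr * (step / max(1, warmup_steps))
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Learning rate at the start, midpoint, and end of training.
for step in (0, 620, 1240):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, is halved by step 620 (end of epoch 10), and reaches zero at step 1240.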

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
	| No log        | 1.0   | 62   | 0.3856          | 67.6564 | 65.4045 | 67.6202 | 67.6206   | 6.6825  |
	| No log        | 2.0   | 124  | 0.3529          | 70.23   | 68.4349 | 70.2289 | 70.1265   | 6.5756  |
	| No log        | 3.0   | 186  | 0.3303          | 75.4875 | 73.3149 | 75.3918 | 75.3835   | 7.9808  |
	| No log        | 4.0   | 248  | 0.3165          | 76.17   | 74.0354 | 76.2341 | 76.1363   | 7.4435  |
	| No log        | 5.0   | 310  | 0.3094          | 76.9425 | 75.0561 | 76.9582 | 76.8794   | 7.9567  |
	| No log        | 6.0   | 372  | 0.3130          | 78.1808 | 76.2533 | 78.1846 | 78.1377   | 7.9062  |
	| No log        | 7.0   | 434  | 0.3081          | 78.5859 | 76.7258 | 78.6782 | 78.5825   | 7.6946  |
	| No log        | 8.0   | 496  | 0.3195          | 78.8452 | 76.85   | 78.8076 | 78.7562   | 8.1663  |
	| 0.3758        | 9.0   | 558  | 0.3103          | 78.9204 | 77.2131 | 78.9671 | 78.9562   | 8.1341  |
	| 0.3758        | 10.0  | 620  | 0.3091          | 78.7793 | 76.8877 | 78.7503 | 78.7031   | 7.7319  |
	| 0.3758        | 11.0  | 682  | 0.3173          | 79.1693 | 77.4324 | 79.2141 | 79.1671   | 7.8881  |
	| 0.3758        | 12.0  | 744  | 0.3192          | 79.3653 | 77.6962 | 79.4379 | 79.3547   | 7.7339  |
	| 0.3758        | 13.0  | 806  | 0.3246          | 79.041  | 77.1587 | 79.1201 | 79.0828   | 7.8438  |
	| 0.3758        | 14.0  | 868  | 0.3312          | 79.4605 | 77.7629 | 79.5227 | 79.4425   | 7.8014  |
	| 0.3758        | 15.0  | 930  | 0.3300          | 79.7724 | 78.167  | 79.8187 | 79.799    | 7.8609  |
	| 0.3758        | 16.0  | 992  | 0.3409          | 79.4618 | 77.694  | 79.4758 | 79.4325   | 7.8296  |
	| 0.14          | 17.0  | 1054 | 0.3436          | 79.1169 | 77.3095 | 79.1082 | 79.092    | 8.0302  |
	| 0.14          | 18.0  | 1116 | 0.3440          | 78.9896 | 77.2319 | 78.984  | 78.9472   | 7.9325  |
	| 0.14          | 19.0  | 1178 | 0.3399          | 79.531  | 77.8083 | 79.5489 | 79.5005   | 7.871   |
	| 0.14          | 20.0  | 1240 | 0.3408          | 79.6622 | 77.9282 | 79.6654 | 79.6384   | 7.8821  |
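
The reported evaluation numbers come from the last epoch, which is not necessarily the best one. As a rough illustration, the sketch below copies the epoch, validation loss, and Rouge1 columns from the table and finds the best epoch under each criterion. It also bounds the training set size from 62 steps per epoch at batch size 64, assuming a single device, no gradient accumulation, and that the last partial batch is kept; none of these are stated in the card.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
RESULTS = [
    (1, 0.3856, 67.6564), (2, 0.3529, 70.23), (3, 0.3303, 75.4875),
    (4, 0.3165, 76.17), (5, 0.3094, 76.9425), (6, 0.3130, 78.1808),
    (7, 0.3081, 78.5859), (8, 0.3195, 78.8452), (9, 0.3103, 78.9204),
    (10, 0.3091, 78.7793), (11, 0.3173, 79.1693), (12, 0.3192, 79.3653),
    (13, 0.3246, 79.041), (14, 0.3312, 79.4605), (15, 0.3300, 79.7724),
    (16, 0.3409, 79.4618), (17, 0.3436, 79.1169), (18, 0.3440, 78.9896),
    (19, 0.3399, 79.531), (20, 0.3408, 79.6622),
]

# Best epoch by lowest validation loss and by highest Rouge1.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])

# Bounds on the number of training examples (assumptions stated above).
STEPS_PER_EPOCH = 62
TRAIN_BATCH_SIZE = 64
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE            # every batch full
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1  # last batch holds one example

print("best by validation loss:", best_by_loss)
print("best by Rouge1:", best_by_rouge1)
print("training examples between", min_examples, "and", max_examples)
```

Validation loss is lowest at epoch 7 (0.3081) and Rouge1 peaks at epoch 15 (79.7724), while the published checkpoint corresponds to epoch 20. Validation loss drifts upward after epoch 7 as training loss keeps falling, which may indicate mild overfitting, although ROUGE stays roughly flat over the later epochs.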


	### Framework versions

	- Transformers 4.41.1
	- PyTorch 1.13.1+cu117
	- Datasets 2.19.1
	- Tokenizers 0.19.1
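
When reproducing results, it can help to check the local environment against the versions listed above. The sketch below uses only the standard library; the helper name `version_mismatches` is illustrative, and the package names are assumed to be the usual PyPI distribution names (`torch` for PyTorch).

```python
from importlib import metadata

# Versions listed in the card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.41.1",
    "torch": "1.13.1+cu117",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def version_mismatches(expected=EXPECTED):
    """Return {package: (wanted, installed)} for packages that differ or are missing."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


# Prints an empty dict when the environment matches the card exactly.
print(version_mismatches())
```

An exact match is not necessarily required to load the model; newer releases will often work, but differences in generation defaults or tokenization could shift the metrics slightly.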