	---
	base_model: samzirbo/mT5.en-es.pretrained
	tags:
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: baseline
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# baseline

	This model is a fine-tuned version of [samzirbo/mT5.en-es.pretrained](https://huggingface.co/samzirbo/mT5.en-es.pretrained) on an unknown dataset.
	It achieves the following results on the evaluation set at the final checkpoint (step 50000):
	- Loss: 1.1724
	- Bleu: 43.677
	- Meteor: 0.6901
	- Chrf++: 62.5868
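
A minimal inference sketch may help readers try the model. It makes several assumptions the card does not confirm: that the repository id is `samzirbo/baseline`, that the tokenizer is stored alongside the weights, that the translation direction is English to Spanish (suggested only by the base model's name), and that no task prefix is required. `translate` and `MODEL_ID` are hypothetical names used for illustration.

```python
MODEL_ID = "samzirbo/baseline"  # assumed repository id; adjust if the model lives elsewhere


def translate(sentences, model_id=MODEL_ID, max_new_tokens=128):
    """Translate a list of sentences (assumed direction: English -> Spanish)."""
    # Imported here so the heavy dependencies load only when translation is requested.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Pad to the longest sentence in the batch and return PyTorch tensors.
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `translate(["The weather is nice today."])` downloads the checkpoint on first use and returns a list with one decoded string per input sentence.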

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0005
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_steps: 1000
	- training_steps: 50000
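
To make the schedule concrete, the sketch below computes the learning rate at a given step from the values listed above. It assumes linear warmup from zero followed by a single half-cosine decay to zero, which is how the Transformers cosine schedule behaves with its default settings. The card does not say whether any scheduler defaults were changed, and `lr_at` is a hypothetical helper, not part of any library.

```python
import math

BASE_LR = 0.0005      # learning_rate from the card
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 50000   # training_steps from the card


def lr_at(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at optimizer step `step` (hypothetical helper)."""
    if step < warmup:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup) / max(1, total - warmup)
    # Assumed half-cosine decay from base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 500, 1000, 25500, 50000):
    print(step, lr_at(step))
```

Under these assumptions the rate peaks at 0.0005 at step 1000, falls to half of that midway through the decay phase (step 25500), and reaches zero at step 50000.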

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Meteor | Chrf++  |
	|:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|
	| 4.3403        | 0.26  | 2500  | 2.0224          | 27.59   | 0.5546 | 49.0181 |
	| 2.4264        | 0.53  | 5000  | 1.7329          | 32.4582 | 0.6023 | 53.8983 |
	| 2.1747        | 0.79  | 7500  | 1.5850          | 35.9783 | 0.6246 | 56.295  |
	| 2.0285        | 1.05  | 10000 | 1.5016          | 37.3015 | 0.638  | 57.5591 |
	| 1.9104        | 1.32  | 12500 | 1.4356          | 38.832  | 0.6501 | 58.6692 |
	| 1.8547        | 1.58  | 15000 | 1.3784          | 39.7112 | 0.6593 | 59.4218 |
	| 1.8013        | 1.84  | 17500 | 1.3481          | 39.9137 | 0.6608 | 59.7434 |
	| 1.7372        | 2.11  | 20000 | 1.3070          | 40.8569 | 0.6679 | 60.4092 |
	| 1.6845        | 2.37  | 22500 | 1.2847          | 41.5254 | 0.6721 | 60.8743 |
	| 1.6611        | 2.64  | 25000 | 1.2574          | 42.0492 | 0.6767 | 61.2287 |
	| 1.6382        | 2.9   | 27500 | 1.2372          | 42.2626 | 0.6806 | 61.5161 |
	| 1.595         | 3.16  | 30000 | 1.2220          | 42.827  | 0.6835 | 61.9015 |
	| 1.5645        | 3.43  | 32500 | 1.2088          | 42.909  | 0.6828 | 61.8832 |
	| 1.5557        | 3.69  | 35000 | 1.1981          | 43.2386 | 0.6852 | 62.1239 |
	| 1.5473        | 3.95  | 37500 | 1.1862          | 43.4076 | 0.6866 | 62.3625 |
	| 1.5147        | 4.22  | 40000 | 1.1797          | 43.5469 | 0.6876 | 62.3958 |
	| 1.5089        | 4.48  | 42500 | 1.1765          | 43.5486 | 0.689  | 62.5208 |
	| 1.5032        | 4.74  | 45000 | 1.1738          | 43.6415 | 0.6893 | 62.5473 |
	| 1.4998        | 5.01  | 47500 | 1.1724          | 43.6758 | 0.6898 | 62.581  |
	| 1.4905        | 5.27  | 50000 | 1.1724          | 43.677  | 0.6901 | 62.5868 |
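
Although the training dataset is not named, its approximate size can be estimated from the last row of the table together with the batch size. This is a rough sketch: it assumes a single device and no gradient accumulation, so that each step consumes exactly 64 examples. The epoch value is rounded to two decimals in the table, so the result is only an estimate.

```python
# Last row of the training results table.
FINAL_STEP = 50000
FINAL_EPOCH = 5.27      # rounded to two decimals in the table
TRAIN_BATCH_SIZE = 64   # train_batch_size from the hyperparameter list

steps_per_epoch = FINAL_STEP / FINAL_EPOCH
# Assumption: one device, no gradient accumulation, so each step sees 64 examples.
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print(f"~{steps_per_epoch:,.0f} steps per epoch")
print(f"~{approx_train_examples:,.0f} training examples")
```

Under these assumptions one epoch is roughly 9,500 steps, which puts the training set on the order of 600,000 examples.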


	### Framework versions

	- Transformers 4.38.0
	- Pytorch 2.2.1+cu121
	- Datasets 2.19.1
	- Tokenizers 0.15.2