alban12/nllb-200-distilled-600M-mt-finetuned-zindi-dyu-to-fr

	---
	library_name: transformers
	license: cc-by-nc-4.0
	base_model: facebook/nllb-200-distilled-600M
	tags:
	- translation
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: nllb-200-distilled-600M-mt-finetuned-zindi-dyu-to-fr
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# nllb-200-distilled-600M-mt-finetuned-zindi-dyu-to-fr

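	This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M). The Trainer did not record the training dataset; the model name suggests Dyula-to-French (dyu to fr) translation data from Zindi, but this card does not confirm it.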
	It achieves the following results on the evaluation set:
	- Loss: 2.2584
	- Bleu: 6.4075

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
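Until this section is filled in, the following is a minimal sketch of how a checkpoint like this one could be loaded for translation with the `transformers` pipeline API. The repository id is taken from the page header, and the NLLB language codes `dyu_Latn` (Dyula) and `fra_Latn` (French) are assumptions inferred from the model name; none of this is confirmed by the card.

```python
# Assumptions (not stated in the card): the repo id below, and the NLLB
# language codes for Dyula and French, both inferred from the model name.
MODEL_ID = "alban12/nllb-200-distilled-600M-mt-finetuned-zindi-dyu-to-fr"
SRC_LANG = "dyu_Latn"
TGT_LANG = "fra_Latn"


def build_translator(model_id=MODEL_ID, src_lang=SRC_LANG, tgt_lang=TGT_LANG):
    """Return a transformers translation pipeline for the given checkpoint."""
    # Imported lazily so the constants above can be inspected without
    # transformers installed.
    from transformers import pipeline

    return pipeline(
        "translation", model=model_id, src_lang=src_lang, tgt_lang=tgt_lang
    )


def translate(texts, translator, max_length=128):
    """Translate a list of source sentences and return plain strings."""
    outputs = translator(list(texts), max_length=max_length)
    return [item["translation_text"] for item in outputs]
```

A call would look like `translate(["..."], build_translator())`, which downloads the weights on first use. The license listed in the metadata is cc-by-nc-4.0, so any use should stay non-commercial.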

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 64
	- eval_batch_size: 128
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3
	- mixed_precision_training: Native AMP
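To make the `linear` scheduler setting concrete, here is a small sketch of the learning rate it implies at each optimizer step. It assumes 127 steps per epoch (inferred from the results table, where step 20 falls at epoch 0.1575) and no warmup, since the card lists no warmup setting; both are assumptions, not values reported by the Trainer.

```python
BASE_LR = 2e-5          # learning_rate from the list above
NUM_EPOCHS = 3          # num_epochs from the list above
TRAIN_BATCH_SIZE = 64   # train_batch_size from the list above

# Assumption: 127 optimizer steps per epoch, inferred from the results table
# (step 20 is logged at epoch 0.1575, and 20 / 0.1575 is about 127).
STEPS_PER_EPOCH = 127
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 381

# Assumption: no warmup, since the card lists no warmup setting.
WARMUP_STEPS = 0


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a step under linear warmup followed by linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Rough estimate of the training-set size, assuming one device and no
# gradient accumulation (neither is stated in the card).
approx_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE  # 8128
```

Under these assumptions the rate starts at 2e-05 and decays linearly to zero at step 381, so the last logged evaluation (step 380) ran with a learning rate close to zero.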

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss | Bleu   |
	|:-------------:|:------:|:----:|:---------------:|:------:|
	| 3.1707        | 0.1575 | 20   | 2.7356          | 4.8084 |
	| 2.9074        | 0.3150 | 40   | 2.5883          | 5.0141 |
	| 2.7168        | 0.4724 | 60   | 2.4902          | 5.5785 |
	| 2.6912        | 0.6299 | 80   | 2.4154          | 5.7743 |
	| 2.6062        | 0.7874 | 100  | 2.3742          | 6.0010 |
	| 2.5794        | 0.9449 | 120  | 2.3480          | 6.1354 |
	| 2.4634        | 1.1024 | 140  | 2.3314          | 5.9899 |
	| 2.5055        | 1.2598 | 160  | 2.3167          | 6.1080 |
	| 2.5062        | 1.4173 | 180  | 2.3032          | 6.3784 |
	| 2.4771        | 1.5748 | 200  | 2.2944          | 6.4510 |
	| 2.4284        | 1.7323 | 220  | 2.2854          | 6.2883 |
	| 2.4423        | 1.8898 | 240  | 2.2783          | 6.5036 |
	| 2.3202        | 2.0472 | 260  | 2.2730          | 6.4039 |
	| 2.3855        | 2.2047 | 280  | 2.2701          | 6.2921 |
	| 2.4292        | 2.3622 | 300  | 2.2658          | 6.3025 |
	| 2.3678        | 2.5197 | 320  | 2.2626          | 6.2881 |
	| 2.4158        | 2.6772 | 340  | 2.2600          | 6.3684 |
	| 2.3510        | 2.8346 | 360  | 2.2588          | 6.2852 |
	| 2.3755        | 2.9921 | 380  | 2.2584          | 6.2819 |
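The log above shows validation loss falling at every evaluation while BLEU peaks earlier and then plateaus. As an illustration, this short sketch finds the best step under each criterion from the logged values; it only re-reads the table and makes no claim about which checkpoint was actually published.

```python
# (step, validation_loss, bleu) rows copied from the training results table.
EVAL_LOG = [
    (20, 2.7356, 4.8084), (40, 2.5883, 5.0141), (60, 2.4902, 5.5785),
    (80, 2.4154, 5.7743), (100, 2.3742, 6.0010), (120, 2.3480, 6.1354),
    (140, 2.3314, 5.9899), (160, 2.3167, 6.1080), (180, 2.3032, 6.3784),
    (200, 2.2944, 6.4510), (220, 2.2854, 6.2883), (240, 2.2783, 6.5036),
    (260, 2.2730, 6.4039), (280, 2.2701, 6.2921), (300, 2.2658, 6.3025),
    (320, 2.2626, 6.2881), (340, 2.2600, 6.3684), (360, 2.2588, 6.2852),
    (380, 2.2584, 6.2819),
]

# Highest BLEU and lowest validation loss do not fall on the same step.
best_by_bleu = max(EVAL_LOG, key=lambda row: row[2])
best_by_loss = min(EVAL_LOG, key=lambda row: row[1])

print(f"best BLEU: {best_by_bleu[2]} at step {best_by_bleu[0]}")
print(f"lowest validation loss: {best_by_loss[1]} at step {best_by_loss[0]}")
```

If BLEU is the metric that matters downstream, one option when retraining is to set `load_best_model_at_end=True` with `metric_for_best_model="bleu"` in the training arguments, so that the Trainer keeps the checkpoint with the highest BLEU instead of the last one.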


	### Framework versions

	- Transformers 4.44.2
	- PyTorch 2.4.0+cu121
	- Datasets 2.18.0
	- Tokenizers 0.19.1