machine_translation / README.md
metadata
language:
  - en
  - zh
metrics:
  - sacrebleu
pipeline_tag: translation
base_model: facebook/mbart-large-cc25

eval

This model is a fine-tuned version of facebook/mbart-large-cc25 on the IWSLT14 En-Zh dataset.
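
A minimal inference sketch follows. This card does not state the repository id, the translation direction, or the language codes. `MODEL_ID` is therefore a hypothetical placeholder, and English to Chinese with the mBART-cc25 codes `en_XX` and `zh_CN` is an assumption. The sketch also assumes the tokenizer files were uploaded with the model.

```python
# Assumptions (not stated in this card): the Hub repository id, the
# translation direction (English -> Chinese), and the language codes.
MODEL_ID = "gyr66/machine_translation"  # hypothetical repository id
SRC_LANG = "en_XX"  # mBART-cc25 code for English
TGT_LANG = "zh_CN"  # mBART-cc25 code for Chinese


def translate(texts, model_id=MODEL_ID, max_new_tokens=64):
    """Translate a list of English sentences into Chinese."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import MBartForConditionalGeneration, MBartTokenizer

    tokenizer = MBartTokenizer.from_pretrained(
        model_id, src_lang=SRC_LANG, tgt_lang=TGT_LANG
    )
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(
        **batch,
        # mBART-cc25 checkpoints start decoding from the target language code.
        decoder_start_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["How are you today?"])` downloads the checkpoint on first use and returns a list with one translated string.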

It achieves the following results on the evaluation set:

  • eval_loss: 3.8405
  • eval_bleu: 3.5173
  • eval_gen_len: 21.5826

It achieves the following results on the test set:

  • test_loss: 3.8337
  • test_bleu: 3.277
  • test_gen_len: 21.6287
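
The metadata lists sacrebleu as the metric, but the card does not say which tokenizer was used when scoring. The sketch below shows one way to compute a corpus-level score for Chinese output, assuming sacreBLEU's built-in `zh` tokenizer and a single reference per sentence. `corpus_bleu_zh` is a hypothetical helper name, not part of the original training code.

```python
def corpus_bleu_zh(hypotheses, references):
    """Corpus-level BLEU for Chinese text, one reference per hypothesis."""
    # Imported lazily so this module loads even without sacrebleu installed.
    import sacrebleu

    # sacreBLEU expects a list of reference streams; there is one stream here.
    # tokenize="zh" is an assumption: the card does not name the tokenizer.
    result = sacrebleu.corpus_bleu(hypotheses, [references], tokenize="zh")
    return result.score  # sacreBLEU reports scores on a 0-100 scale
```

Scores are only comparable when the tokenizer and reference set match, so a different `tokenize` setting may not reproduce the numbers above.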

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 7
  • num_epochs: 9
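
The values above could be passed to the Transformers `Seq2SeqTrainingArguments` roughly as sketched below. The mapping of `train_batch_size` to `per_device_train_batch_size` assumes single-device training. The `output_dir` value and `predict_with_generate=True` are also assumptions, since the card does not list them. All other arguments are left at the library defaults.

```python
# Values copied from the hyperparameter list in this card.
REPORTED_HYPERPARAMETERS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 7,  # assumes training on a single device
    "num_train_epochs": 9,
}


def build_training_args(output_dir="mbart-iwslt14-en-zh"):
    """Build training arguments from the reported values.

    output_dir is a hypothetical path chosen for illustration.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        # Assumption: generation during evaluation, which BLEU and gen_len need.
        predict_with_generate=True,
        **REPORTED_HYPERPARAMETERS,
    )
```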

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.0.1+cu117
  • Datasets 2.15.0
  • Tokenizers 0.15.0
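
When reproducing the results, it can help to compare the local environment against the versions listed above. The sketch below uses only the standard library. The pip package names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual distribution names, and `find_version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list in this card, keyed by pip package name.
EXPECTED_VERSIONS = {
    "transformers": "4.35.2",
    "torch": "2.0.1+cu117",
    "datasets": "2.15.0",
    "tokenizers": "0.15.0",
}


def find_version_mismatches(expected=EXPECTED_VERSIONS):
    """Return {package: (expected, installed)} for each package that differs.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches
```

An empty result means the environment matches exactly. A PyTorch build for a different CUDA version shows up as a mismatch because of the `+cu117` suffix, which may or may not matter for reproduction.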