metadata
language:
  - ko
  - en
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ko-en_mbartLarge_exp5p
    results: []

ko-en_mbartLarge_exp5p

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt. The training dataset is not recorded in this card. It achieves the following results on the evaluation set:

  • Loss: 1.2328
  • Bleu: 26.5495
  • Gen Len: 18.4213
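
A minimal sketch of how a checkpoint like this one might be loaded for translation with Transformers. The card does not document usage, so several things here are assumptions: the repository id `yesj1234/ko-en_mbartLarge_exp5p` is guessed from the model name, the Korean-to-English direction is inferred from the `ko-en` prefix, and the repository is assumed to include the tokenizer files. `ko_KR` and `en_XX` are the mBART-50 language codes for Korean and English.

```python
# Assumed repository id; the card does not state it explicitly.
MODEL_ID = "yesj1234/ko-en_mbartLarge_exp5p"
SRC_LANG = "ko_KR"  # mBART-50 code for Korean
TGT_LANG = "en_XX"  # mBART-50 code for English


def translate(sentences, model_id=MODEL_ID, max_new_tokens=64):
    """Translate a list of Korean sentences to English (assumed direction)."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    inputs = tokenizer(sentences, return_tensors="pt", padding=True)
    generated = model.generate(
        **inputs,
        # mBART-50 selects the output language via the first generated token.
        forced_bos_token_id=tokenizer.lang_code_to_id[TGT_LANG],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["안녕하세요."])` would download the weights on first use. The output depends on the checkpoint, so none is shown here.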

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 40
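
The totals in the list above follow from the per-device values, and the schedule can be sketched in a few lines. This is an illustration, not the training script: `cosine_lr` reproduces the usual shape of linear warmup followed by a half-cosine decay to zero, and `total_steps` is a hypothetical parameter because the card does not state the planned number of optimizer steps.

```python
import math

# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH = 4
PER_DEVICE_EVAL_BATCH = 4
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 2
BASE_LR = 5e-05
WARMUP_STEPS = 500


def effective_batch_sizes():
    """Return (total_train_batch_size, total_eval_batch_size)."""
    # Gradient accumulation multiplies the training batch only.
    train = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
    eval_ = PER_DEVICE_EVAL_BATCH * NUM_DEVICES
    return train, eval_


def cosine_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS):
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))
```

`effective_batch_sizes()` gives `(32, 16)`, matching total_train_batch_size and total_eval_batch_size.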

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|
| 1.6447        | 0.46  | 1000  | 1.5338          | 20.0927 | 18.3986 |
| 1.4737        | 0.93  | 2000  | 1.4057          | 22.6168 | 18.5462 |
| 1.3708        | 1.39  | 3000  | 1.3645          | 23.158  | 18.5132 |
| 1.3357        | 1.86  | 4000  | 1.3166          | 24.2178 | 18.4343 |
| 1.2274        | 2.32  | 5000  | 1.2854          | 24.8105 | 18.4761 |
| 1.2113        | 2.78  | 6000  | 1.2622          | 25.4518 | 18.2672 |
| 1.1392        | 3.25  | 7000  | 1.2540          | 25.6184 | 18.4032 |
| 1.125         | 3.71  | 8000  | 1.2401          | 25.3848 | 18.3781 |
| 1.0423        | 4.18  | 9000  | 1.2354          | 25.9776 | 18.3387 |
| 1.011         | 4.64  | 10000 | 1.2418          | 26.1619 | 18.4858 |
| 0.9493        | 5.1   | 11000 | 1.2616          | 25.6398 | 18.2273 |
| 0.888         | 5.57  | 12000 | 1.2328          | 26.5446 | 18.438  |
| 0.8648        | 6.03  | 13000 | 1.2618          | 26.0371 | 18.4074 |
| 0.776         | 6.5   | 14000 | 1.2669          | 26.0043 | 18.4629 |
| 0.7856        | 6.96  | 15000 | 1.2592          | 26.2716 | 18.403  |
| 0.6997        | 7.42  | 16000 | 1.3154          | 25.7842 | 18.3693 |
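
The table above can be read programmatically to see which evaluation step performed best. A small sketch with the rows copied from the table. The helper names are illustrative, and the steps-per-epoch figure is only an estimate derived from the last logged row.

```python
# (training_loss, epoch, step, validation_loss, bleu, gen_len), copied from the table.
ROWS = [
    (1.6447, 0.46, 1000, 1.5338, 20.0927, 18.3986),
    (1.4737, 0.93, 2000, 1.4057, 22.6168, 18.5462),
    (1.3708, 1.39, 3000, 1.3645, 23.158, 18.5132),
    (1.3357, 1.86, 4000, 1.3166, 24.2178, 18.4343),
    (1.2274, 2.32, 5000, 1.2854, 24.8105, 18.4761),
    (1.2113, 2.78, 6000, 1.2622, 25.4518, 18.2672),
    (1.1392, 3.25, 7000, 1.2540, 25.6184, 18.4032),
    (1.125, 3.71, 8000, 1.2401, 25.3848, 18.3781),
    (1.0423, 4.18, 9000, 1.2354, 25.9776, 18.3387),
    (1.011, 4.64, 10000, 1.2418, 26.1619, 18.4858),
    (0.9493, 5.1, 11000, 1.2616, 25.6398, 18.2273),
    (0.888, 5.57, 12000, 1.2328, 26.5446, 18.438),
    (0.8648, 6.03, 13000, 1.2618, 26.0371, 18.4074),
    (0.776, 6.5, 14000, 1.2669, 26.0043, 18.4629),
    (0.7856, 6.96, 15000, 1.2592, 26.2716, 18.403),
    (0.6997, 7.42, 16000, 1.3154, 25.7842, 18.3693),
]


def best_by_val_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


def best_by_bleu(rows):
    """Row with the highest BLEU score."""
    return max(rows, key=lambda r: r[4])


def approx_steps_per_epoch(rows):
    """Rough estimate from the last logged row: step / epoch."""
    _, epoch, step, *_ = rows[-1]
    return step / epoch
```

Both helpers select step 12000 (validation loss 1.2328, BLEU 26.5446). After that step the training loss keeps falling while the validation loss rises, which is the usual sign of overfitting. The last logged row is at epoch 7.42, well short of the configured 40 epochs.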

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.5
  • Tokenizers 0.14.1
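
To reproduce the environment, it can help to compare the installed packages against the versions listed above. A hedged sketch using the standard library: the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are assumed here, and `find_version_mismatches` is a hypothetical helper, not part of any library.

```python
from importlib import metadata

# Versions copied from the list above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.34.0",
    "torch": "2.1.0+cu121",
    "datasets": "2.14.5",
    "tokenizers": "0.14.1",
}


def find_version_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Return {package: (wanted, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches
```

An empty result means the environment matches the card. Otherwise each entry shows the wanted and the installed version side by side.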