en_he_base / README.md
metadata
language:
  - en
  - he
tags:
  - generated_from_trainer
model-index:
  - name: marian_base
    results: []

marian_base

This model is a fine-tuned version of orendar/en_he_base. The training dataset was not recorded when this card was generated. It achieves the following result on the evaluation set:

  • Loss: 1.6365
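
For intuition, the loss can be converted to a perplexity. The sketch below assumes the reported value is a mean per-token cross-entropy in nats. That is the usual convention for sequence-to-sequence models trained with the Transformers Trainer, but this card does not state it.

```python
import math

# Evaluation loss reported above. Assumed (not stated in the card) to be
# the mean per-token cross-entropy in nats.
eval_loss = 1.6365

# Perplexity is the exponential of the cross-entropy.
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity is a little above 5, meaning the model is on average about as uncertain as a uniform choice among five tokens.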

Model description

More information needed

Intended uses & limitations

More information needed
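
Until this section is filled in, the following is a minimal usage sketch rather than documented behavior. It assumes the checkpoint id `orendar/en_he_base` named in this card loads with the Transformers `translation` pipeline, and that the model translates from English to Hebrew, as the repository name and language tags suggest. The helper name `translate` is hypothetical.

```python
def translate(texts, model_id="orendar/en_he_base"):
    """Translate a list of strings with the Transformers translation pipeline.

    Assumption: the checkpoint translates English to Hebrew.
    """
    # Imported lazily so that defining this helper does not load the library.
    from transformers import pipeline

    translator = pipeline("translation", model=model_id)
    # For a list input the pipeline returns one dict per string,
    # each with a "translation_text" entry.
    return [out["translation_text"] for out in translator(texts)]

# Example call (downloads the checkpoint on first use):
# translate(["How are you today?"])
```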

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 160
  • eval_batch_size: 160
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 320
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
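
The relationship between these values can be checked with a few lines of arithmetic. The sketch below assumes a single device and no warmup, since neither a device count nor warmup steps are listed. The steps-per-epoch figure is taken from the Step column of the training results in this card.

```python
# Values copied from the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 160
gradient_accumulation_steps = 2
num_epochs = 20
steps_per_epoch = 30633  # Step value at epoch 1.0 in the training results

# Assumption: one device, so the effective batch size is the per-device
# batch size times the number of accumulation steps.
num_devices = 1
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

total_steps = steps_per_epoch * num_epochs


def linear_lr(step, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size, total_steps, linear_lr(total_steps // 2))
```

The computed effective batch size matches the listed total_train_batch_size of 320, and the total step count matches the final Step value in the training results.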

Training results

| Training Loss | Epoch | Step   | Validation Loss |
|---------------|-------|--------|-----------------|
| 2.125         | 1.0   | 30633  | 2.0720          |
| 1.9288        | 2.0   | 61266  | 1.9033          |
| 1.8387        | 3.0   | 91899  | 1.8330          |
| 1.7832        | 4.0   | 122532 | 1.7864          |
| 1.7445        | 5.0   | 153165 | 1.7592          |
| 1.7152        | 6.0   | 183798 | 1.7404          |
| 1.6933        | 7.0   | 214431 | 1.7208          |
| 1.6743        | 8.0   | 245064 | 1.7005          |
| 1.6561        | 9.0   | 275697 | 1.6907          |
| 1.6431        | 10.0  | 306330 | 1.6903          |
| 1.6282        | 11.0  | 336963 | 1.6801          |
| 1.6173        | 12.0  | 367596 | 1.6714          |
| 1.6061        | 13.0  | 398229 | 1.6634          |
| 1.5971        | 14.0  | 428862 | 1.6543          |
| 1.5867        | 15.0  | 459495 | 1.6488          |
| 1.5781        | 16.0  | 490128 | 1.6447          |
| 1.5684        | 17.0  | 520761 | 1.6388          |
| 1.5597        | 18.0  | 551394 | 1.6416          |
| 1.5521        | 19.0  | 582027 | 1.6370          |
| 1.5438        | 20.0  | 612660 | 1.6365          |
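
The results above can be summarized programmatically. The following sketch copies the per-epoch losses into a list, then finds the best validation epoch, any epochs where validation loss rose compared with the previous one, and the final gap between validation and training loss. The variable names are illustrative only.

```python
# (epoch, training loss, validation loss) copied from the results above.
results = [
    (1, 2.125, 2.0720), (2, 1.9288, 1.9033), (3, 1.8387, 1.8330),
    (4, 1.7832, 1.7864), (5, 1.7445, 1.7592), (6, 1.7152, 1.7404),
    (7, 1.6933, 1.7208), (8, 1.6743, 1.7005), (9, 1.6561, 1.6907),
    (10, 1.6431, 1.6903), (11, 1.6282, 1.6801), (12, 1.6173, 1.6714),
    (13, 1.6061, 1.6634), (14, 1.5971, 1.6543), (15, 1.5867, 1.6488),
    (16, 1.5781, 1.6447), (17, 1.5684, 1.6388), (18, 1.5597, 1.6416),
    (19, 1.5521, 1.6370), (20, 1.5438, 1.6365),
]

# Epoch with the lowest validation loss.
best_epoch, _, best_val = min(results, key=lambda row: row[2])

# Epochs whose validation loss is higher than the previous epoch's.
regressions = [
    cur[0] for prev, cur in zip(results, results[1:]) if cur[2] > prev[2]
]

# Gap between validation and training loss at the final epoch.
final_gap = results[-1][2] - results[-1][1]

print(best_epoch, best_val, regressions, round(final_gap, 4))
```

Validation loss is lowest at the final epoch and rises only once, at epoch 18, so the reported evaluation loss of 1.6365 corresponds to the last checkpoint.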

Framework versions

  • Transformers 4.18.0.dev0
  • Pytorch 1.11.0+cu102
  • Datasets 1.18.4
  • Tokenizers 0.11.6
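
To compare a local environment against these versions, a small check along the following lines may help. The distribution names are assumptions (for example, PyTorch is published as `torch`), and the helper `installed_versions` is hypothetical. A `.dev0` build of Transformers is a development version, so an exact match is unlikely with a released package.

```python
from importlib import metadata

# Versions listed above, keyed by assumed distribution name.
expected = {
    "transformers": "4.18.0.dev0",
    "torch": "1.11.0+cu102",
    "datasets": "1.18.4",
    "tokenizers": "0.11.6",
}


def installed_versions(packages):
    """Return {name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


for name, have in installed_versions(expected).items():
    status = "ok" if have == expected[name] else "differs"
    print(f"{name}: expected {expected[name]}, found {have} ({status})")
```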