metadata
license: mit
base_model: facebook/w2v-bert-2.0
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: malayalam_combined_Conversation
    results: []


malayalam_combined_Conversation

This model is a fine-tuned version of facebook/w2v-bert-2.0. The training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.9570
  • WER (word error rate): 0.6223
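
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` is hypothetical and is not taken from the training script, which may use a different implementation and may aggregate errors over the whole evaluation set rather than per utterance.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 0.6223 corresponds to roughly 62 word-level errors per 100 reference words.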

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 10
  • mixed_precision_training: Native AMP
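
Two of the values above are derived rather than set directly, and the sketch below illustrates how. It shows the effective batch size arithmetic and a linear schedule with warmup in plain Python. `NUM_DEVICES = 1` is an assumption (it is the value that makes 16 × 2 equal the reported 32), and `TOTAL_STEPS = 8100` is a hypothetical round figure chosen for illustration only; the card does not state the actual total. The function names are also hypothetical.

```python
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 50
NUM_DEVICES = 1      # assumption: not stated in the card
TOTAL_STEPS = 8100   # assumption: hypothetical total number of optimizer steps


def effective_batch_size(per_device: int, accum_steps: int, devices: int) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * accum_steps * devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 25, 50, 4000, TOTAL_STEPS):
        print(step, linear_lr(step))
```

With only 50 warmup steps, the learning rate reaches its peak almost immediately and spends nearly the whole run decaying.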

Training results

| Training Loss | Epoch | Step | Validation Loss | WER |
|---|---|---|---|---|
| 1.3673 | 0.6177 | 500 | 1.3771 | 0.7996 |
| 1.1485 | 1.2353 | 1000 | 1.2069 | 0.7644 |
| 1.0712 | 1.8530 | 1500 | 1.1157 | 0.7296 |
| 1.0101 | 2.4707 | 2000 | 1.0969 | 0.7344 |
| 0.9326 | 3.0883 | 2500 | 1.0566 | 0.6889 |
| 0.8723 | 3.7060 | 3000 | 1.0339 | 0.6861 |
| 0.8198 | 4.3237 | 3500 | 1.0028 | 0.6830 |
| 0.8092 | 4.9413 | 4000 | 1.0108 | 0.6681 |
| 0.7574 | 5.5590 | 4500 | 1.0049 | 0.6676 |
| 0.7027 | 6.1767 | 5000 | 0.9725 | 0.6660 |
| 0.6981 | 6.7943 | 5500 | 0.9649 | 0.6653 |
| 0.6684 | 7.4120 | 6000 | 0.9500 | 0.6393 |
| 0.6295 | 8.0296 | 6500 | 0.9535 | 0.6364 |
| 0.5947 | 8.6473 | 7000 | 0.9522 | 0.6338 |
| 0.5483 | 9.2650 | 7500 | 0.9821 | 0.6262 |
| 0.5437 | 9.8826 | 8000 | 0.9570 | 0.6223 |
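
As a quick way to read the logged results, the sketch below loads the rows above into a list and picks out the checkpoints with the lowest validation loss and the lowest word error rate, along with the relative reduction in word error rate between the first and last logged steps. The variable names are illustrative only; the numbers are copied from the table.

```python
# Each row: (training_loss, epoch, step, validation_loss, wer)
RESULTS = [
    (1.3673, 0.6177, 500, 1.3771, 0.7996),
    (1.1485, 1.2353, 1000, 1.2069, 0.7644),
    (1.0712, 1.8530, 1500, 1.1157, 0.7296),
    (1.0101, 2.4707, 2000, 1.0969, 0.7344),
    (0.9326, 3.0883, 2500, 1.0566, 0.6889),
    (0.8723, 3.7060, 3000, 1.0339, 0.6861),
    (0.8198, 4.3237, 3500, 1.0028, 0.6830),
    (0.8092, 4.9413, 4000, 1.0108, 0.6681),
    (0.7574, 5.5590, 4500, 1.0049, 0.6676),
    (0.7027, 6.1767, 5000, 0.9725, 0.6660),
    (0.6981, 6.7943, 5500, 0.9649, 0.6653),
    (0.6684, 7.4120, 6000, 0.9500, 0.6393),
    (0.6295, 8.0296, 6500, 0.9535, 0.6364),
    (0.5947, 8.6473, 7000, 0.9522, 0.6338),
    (0.5483, 9.2650, 7500, 0.9821, 0.6262),
    (0.5437, 9.8826, 8000, 0.9570, 0.6223),
]

# Checkpoint with the lowest validation loss and with the lowest WER.
best_loss_row = min(RESULTS, key=lambda row: row[3])
best_wer_row = min(RESULTS, key=lambda row: row[4])

# Relative WER reduction between the first and last logged evaluations.
first_wer = RESULTS[0][4]
final_wer = RESULTS[-1][4]
relative_wer_reduction = (first_wer - final_wer) / first_wer

if __name__ == "__main__":
    print("lowest validation loss at step", best_loss_row[2])
    print("lowest WER at step", best_wer_row[2])
    print("relative WER reduction:", round(relative_wer_reduction, 4))
```

In these rows the lowest validation loss (0.9500) occurs at step 6000, while the lowest word error rate (0.6223) occurs at the final logged step, 8000, so the two criteria would select different checkpoints.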

Framework versions

  • Transformers 4.43.0.dev0
  • PyTorch 1.14.0a0+44dac51
  • Datasets 2.16.1
  • Tokenizers 0.19.1