
w2v-bert-grain-lg_cv_only_v2

This model is a fine-tuned version of facebook/w2v-bert-2.0 on the common_voice_17_0 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6384
  • WER: 0.2320
  • CER: 0.0721
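
For reference, transcription with this checkpoint can be run through the Transformers CTC API. The sketch below is a minimal example, assuming 16 kHz mono audio; the file name sample.wav is a placeholder, not part of the original card:

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2BertForCTC

model_id = "sulaimank/w2v-bert-grain-lg_cv_only_v2"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)
model.eval()

# Load audio at the 16 kHz sampling rate expected by w2v-bert-2.0.
# "sample.wav" is a hypothetical local file.
speech, _ = librosa.load("sample.wav", sr=16000)

inputs = processor(speech, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: argmax over the vocabulary at each frame,
# then collapse repeats and blanks via the tokenizer.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)
```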

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 80
  • mixed_precision_training: Native AMP
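
These settings map onto the Transformers Trainer roughly as follows. This is a minimal sketch, not the original training script: output_dir is a placeholder, and fp16=True is an assumption standing in for "Native AMP":

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="w2v-bert-grain-lg_cv_only_v2",  # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80,
    fp16=True,  # assumed equivalent of "Native AMP" mixed precision
)
```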

Training results

| Training Loss | Epoch | Step   | Validation Loss | WER    | CER    |
|:-------------:|:-----:|:------:|:---------------:|:------:|:------:|
| 0.3102        | 1.0   | 8884   | 0.4540          | 0.3644 | 0.1028 |
| 0.2032        | 2.0   | 17768  | 0.3881          | 0.3005 | 0.0845 |
| 0.1687        | 3.0   | 26652  | 0.4061          | 0.3139 | 0.0883 |
| 0.1480        | 4.0   | 35536  | 0.4048          | 0.2879 | 0.0825 |
| 0.1327        | 5.0   | 44420  | 0.4136          | 0.2860 | 0.0831 |
| 0.1191        | 6.0   | 53304  | 0.3685          | 0.2889 | 0.0843 |
| 0.1087        | 7.0   | 62188  | 0.4108          | 0.2630 | 0.0810 |
| 0.0996        | 8.0   | 71072  | 0.3682          | 0.2628 | 0.0789 |
| 0.0918        | 9.0   | 79956  | 0.4126          | 0.2672 | 0.0779 |
| 0.0854        | 10.0  | 88840  | 0.3482          | 0.2628 | 0.0783 |
| 0.0778        | 11.0  | 97724  | 0.3948          | 0.2540 | 0.0773 |
| 0.0719        | 12.0  | 106608 | 0.3530          | 0.2477 | 0.0740 |
| 0.0660        | 13.0  | 115492 | 0.4267          | 0.2604 | 0.0785 |
| 0.0595        | 14.0  | 124376 | 0.3779          | 0.2467 | 0.0727 |
| 0.0541        | 15.0  | 133260 | 0.4424          | 0.2622 | 0.0813 |
| 0.0485        | 16.0  | 142144 | 0.3848          | 0.2500 | 0.0755 |
| 0.0440        | 17.0  | 151028 | 0.3752          | 0.2465 | 0.0736 |
| 0.0391        | 18.0  | 159912 | 0.3722          | 0.2524 | 0.0753 |
| 0.0347        | 19.0  | 168796 | 0.4386          | 0.2481 | 0.0762 |
| 0.0309        | 20.0  | 177680 | 0.4647          | 0.2552 | 0.0788 |
| 0.0273        | 21.0  | 186564 | 0.4453          | 0.2468 | 0.0736 |
| 0.0252        | 22.0  | 195448 | 0.4612          | 0.2450 | 0.0750 |
| 0.0229        | 23.0  | 204332 | 0.4624          | 0.2510 | 0.0750 |
| 0.0209        | 24.0  | 213216 | 0.4640          | 0.2535 | 0.0739 |
| 0.0186        | 25.0  | 222100 | 0.4309          | 0.2542 | 0.0747 |
| 0.0173        | 26.0  | 230984 | 0.4339          | 0.2490 | 0.0734 |
| 0.0160        | 27.0  | 239868 | 0.4463          | 0.2477 | 0.0740 |
| 0.0143        | 28.0  | 248752 | 0.5788          | 0.2432 | 0.0784 |
| 0.0135        | 29.0  | 257636 | 0.4962          | 0.2482 | 0.0745 |
| 0.0124        | 30.0  | 266520 | 0.5620          | 0.2448 | 0.0794 |
| 0.0116        | 31.0  | 275404 | 0.5030          | 0.2419 | 0.0749 |
| 0.0108        | 32.0  | 284288 | 0.4731          | 0.2374 | 0.0729 |
| 0.0099        | 33.0  | 293172 | 0.4890          | 0.2425 | 0.0736 |
| 0.0095        | 34.0  | 302056 | 0.5449          | 0.2449 | 0.0783 |
| 0.0086        | 35.0  | 310940 | 0.5007          | 0.2355 | 0.0726 |
| 0.0082        | 36.0  | 319824 | 0.4715          | 0.2372 | 0.0738 |
| 0.0079        | 37.0  | 328708 | 0.5407          | 0.2430 | 0.0731 |
| 0.0072        | 38.0  | 337592 | 0.5361          | 0.2374 | 0.0738 |
| 0.0068        | 39.0  | 346476 | 0.5152          | 0.2459 | 0.0755 |
| 0.0063        | 40.0  | 355360 | 0.4737          | 0.2316 | 0.0715 |
| 0.0058        | 41.0  | 364244 | 0.5980          | 0.2391 | 0.0779 |
| 0.0052        | 42.0  | 373128 | 0.5633          | 0.2360 | 0.0727 |
| 0.0051        | 43.0  | 382012 | 0.5640          | 0.2352 | 0.0732 |
| 0.0046        | 44.0  | 390896 | 0.5674          | 0.2270 | 0.0710 |
| 0.0044        | 45.0  | 399780 | 0.5487          | 0.2352 | 0.0717 |
| 0.0042        | 46.0  | 408664 | 0.6279          | 0.2436 | 0.0786 |
| 0.0039        | 47.0  | 417548 | 0.6260          | 0.2438 | 0.0770 |
| 0.0038        | 48.0  | 426432 | 0.5995          | 0.2328 | 0.0763 |
| 0.0036        | 49.0  | 435316 | 0.6540          | 0.2403 | 0.0776 |
| 0.0031        | 50.0  | 444200 | 0.5347          | 0.2370 | 0.0747 |
| 0.0028        | 51.0  | 453084 | 0.6086          | 0.2490 | 0.0739 |
| 0.0026        | 52.0  | 461968 | 0.5515          | 0.2287 | 0.0693 |
| 0.0025        | 53.0  | 470852 | 0.6788          | 0.2414 | 0.0793 |
| 0.0023        | 54.0  | 479736 | 0.6384          | 0.2320 | 0.0721 |
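
The WER and CER columns above can be reproduced with the Hugging Face evaluate library. A minimal sketch, assuming parallel lists of reference and predicted transcripts (the example strings below are made up):

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Hypothetical reference/prediction pairs; in practice these come from
# decoding the evaluation split as in the inference example above.
references = ["example reference one", "example reference two"]
predictions = ["example reference one", "example referense two"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```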

Framework versions

  • Transformers 4.46.1
  • Pytorch 2.1.0+cu118
  • Datasets 3.1.0
  • Tokenizers 0.20.1