w2v-bert-2.0-nonstudio_and_studioRecords_final

This model is a fine-tuned version of facebook/w2v-bert-2.0; the training dataset is not specified. At the final logged checkpoint (step 12600), it achieves the following results on the evaluation set:

  • Loss: 0.1772
  • WER: 0.1266 (word error rate, i.e. 12.66%)
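
For readers unfamiliar with the metric, word error rate is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The card does not say which implementation or text normalization produced the number above, so the following is only a minimal sketch of the standard definition; the function name `word_error_rate` is illustrative and not part of this model's code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization and no text normalization
    (casing, punctuation); the card does not document what was used.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Under this definition, a WER of 0.1266 corresponds to roughly 13 word-level errors per 100 reference words.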

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
  • mixed_precision_training: Native AMP
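
To make two of these settings concrete, the sketch below recomputes the total batch size and evaluates a linear warmup and decay schedule in plain Python. It is an illustration, not the training script: `NUM_DEVICES = 1` is an assumption (it is consistent with 16 × 2 = 32), and `TOTAL_STEPS = 13_040` is an estimate inferred from the step and epoch columns of the training results table (about 1304 optimizer steps per epoch over 10 epochs), not a value stated in the card.

```python
PEAK_LR = 5e-05           # learning_rate
WARMUP_STEPS = 500        # lr_scheduler_warmup_steps
TRAIN_BATCH_SIZE = 16     # train_batch_size (per device)
GRAD_ACCUM_STEPS = 2      # gradient_accumulation_steps
NUM_DEVICES = 1           # assumption: not stated in the card
TOTAL_STEPS = 13_040      # assumption: estimated from the results table


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         grad_accum=GRAD_ACCUM_STEPS,
                         devices=NUM_DEVICES) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * grad_accum * devices


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 250, 500, 6_000, 12_600, 13_040):
        print(step, linear_lr(step))
```

With these assumed values, the learning rate peaks at 5e-05 at step 500 and is already close to zero by the last logged checkpoint at step 12600.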

Training results

Training Loss | Epoch  | Step  | Validation Loss | WER
1.055         | 0.4601 | 600   | 0.3683          | 0.4608
0.1734        | 0.9202 | 1200  | 0.2620          | 0.3546
0.1242        | 1.3804 | 1800  | 0.2115          | 0.3018
0.1075        | 1.8405 | 2400  | 0.2004          | 0.2889
0.0888        | 2.3006 | 3000  | 0.1870          | 0.2573
0.078         | 2.7607 | 3600  | 0.1724          | 0.2267
0.0664        | 3.2209 | 4200  | 0.1572          | 0.2244
0.0576        | 3.6810 | 4800  | 0.1746          | 0.2217
0.0522        | 4.1411 | 5400  | 0.1643          | 0.1796
0.0415        | 4.6012 | 6000  | 0.1781          | 0.1851
0.0398        | 5.0613 | 6600  | 0.1670          | 0.1714
0.0301        | 5.5215 | 7200  | 0.1531          | 0.1617
0.0296        | 5.9816 | 7800  | 0.1463          | 0.1590
0.0211        | 6.4417 | 8400  | 0.1566          | 0.1473
0.0206        | 6.9018 | 9000  | 0.1423          | 0.1468
0.0147        | 7.3620 | 9600  | 0.1443          | 0.1413
0.0136        | 7.8221 | 10200 | 0.1539          | 0.1418
0.0105        | 8.2822 | 10800 | 0.1611          | 0.1383
0.0079        | 8.7423 | 11400 | 0.1761          | 0.1351
0.0063        | 9.2025 | 12000 | 0.1814          | 0.1304
0.0043        | 9.6626 | 12600 | 0.1772          | 0.1266
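
The table can be read programmatically to see where each metric bottoms out. The sketch below copies the step, epoch, validation loss, and WER columns from the table above and reports the checkpoint with the lowest validation loss, the checkpoint with the lowest WER, and an estimate of optimizer steps per epoch. The names `Row`, `ROWS`, and `steps_per_epoch` are illustrative, and the steps-per-epoch figure is an inference from the logged values rather than something the card states.

```python
from collections import namedtuple

Row = namedtuple("Row", "step epoch val_loss wer")

# Values copied from the training results table.
ROWS = [
    Row(600, 0.4601, 0.3683, 0.4608),
    Row(1200, 0.9202, 0.2620, 0.3546),
    Row(1800, 1.3804, 0.2115, 0.3018),
    Row(2400, 1.8405, 0.2004, 0.2889),
    Row(3000, 2.3006, 0.1870, 0.2573),
    Row(3600, 2.7607, 0.1724, 0.2267),
    Row(4200, 3.2209, 0.1572, 0.2244),
    Row(4800, 3.6810, 0.1746, 0.2217),
    Row(5400, 4.1411, 0.1643, 0.1796),
    Row(6000, 4.6012, 0.1781, 0.1851),
    Row(6600, 5.0613, 0.1670, 0.1714),
    Row(7200, 5.5215, 0.1531, 0.1617),
    Row(7800, 5.9816, 0.1463, 0.1590),
    Row(8400, 6.4417, 0.1566, 0.1473),
    Row(9000, 6.9018, 0.1423, 0.1468),
    Row(9600, 7.3620, 0.1443, 0.1413),
    Row(10200, 7.8221, 0.1539, 0.1418),
    Row(10800, 8.2822, 0.1611, 0.1383),
    Row(11400, 8.7423, 0.1761, 0.1351),
    Row(12000, 9.2025, 0.1814, 0.1304),
    Row(12600, 9.6626, 0.1772, 0.1266),
]


def steps_per_epoch(rows=ROWS) -> int:
    """Estimate optimizer steps per epoch from the last logged row."""
    last = rows[-1]
    return round(last.step / last.epoch)


best_loss = min(ROWS, key=lambda r: r.val_loss)
best_wer = min(ROWS, key=lambda r: r.wer)

if __name__ == "__main__":
    print("lowest validation loss:", best_loss)
    print("lowest WER:", best_wer)
    print("estimated steps per epoch:", steps_per_epoch())
```

In these logged values, validation loss is lowest at step 9000 (0.1423) and rises afterwards, while WER keeps falling to its minimum at step 12600 (0.1266). Which checkpoint is preferable depends on whether loss or WER is the selection criterion; the card reports the step 12600 figures.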

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1

Model size

  • 606M params (Safetensors, tensor type F32)

Model repository: Bajiyo/w2v-bert-2.0-nonstudio_and_studioRecords_final (fine-tuned from facebook/w2v-bert-2.0)