---
license: apache-2.0
base_model: facebook/wav2vec2-large-xlsr-53
tags:
- generated_from_trainer
datasets:
- common_voice
metrics:
- wer
model-index:
- name: wav2vec2-large-xlsr53-zh-cn-subset-colab
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: common_voice
      type: common_voice
      config: zh-CN
      split: test[:20%]
      args: zh-CN
    metrics:
    - name: Wer
      type: wer
      value: 0.9394977168949772
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# wav2vec2-large-xlsr53-zh-cn-subset-colab

This model is a fine-tuned version of [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on the `zh-CN` configuration of the common_voice dataset.
It achieves the following results on the evaluation set (the `test[:20%]` split):
- Loss: 1.3992
- Wer: 0.9395
- Cer: 0.3184
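
The gap between Wer and Cer above is easier to read with the two definitions side by side. The following is a minimal sketch, not the evaluation code used for this model: the helper names (`edit_distance`, `wer`, `cer`) and the sample strings are hypothetical, and it assumes words are split on whitespace. Under that assumption, an unsegmented Chinese sentence counts as a single word, so one wrong character yields a Wer of 1.0 while the Cer stays low. Whether the transcripts here were scored without segmentation is not stated in this card.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using two DP rows."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate, ignoring spaces (a convention of this sketch)."""
    ref_chars = list(reference.replace(" ", ""))
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# Hypothetical example: one of six characters is wrong.
reference = "今天天气很好"
hypothesis = "今天天汽很好"

print(f"WER = {wer(reference, hypothesis):.4f}")
print(f"CER = {cer(reference, hypothesis):.4f}")
```

For a language written without spaces, Cer is usually the more informative of the two numbers.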

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 13
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 26
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 100
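
The batch-size and scheduler entries above are related by simple arithmetic, sketched here in plain Python. This is an illustration, not the Trainer's own code. `TOTAL_STEPS` is an assumption inferred from the results table (step 8400 at epoch 40.0 suggests about 210 optimizer steps per epoch, hence about 21,000 steps over 100 epochs), and a single training device is assumed. The function names are hypothetical.

```python
TRAIN_BATCH_SIZE = 13      # train_batch_size
GRAD_ACCUM_STEPS = 2       # gradient_accumulation_steps
PEAK_LR = 5e-5             # learning_rate
WARMUP_STEPS = 500         # lr_scheduler_warmup_steps
TOTAL_STEPS = 21_000       # assumption: ~210 steps/epoch * 100 epochs


def effective_batch_size(per_step, accum_steps):
    """Examples consumed per optimizer update on one device."""
    return per_step * accum_steps


def linear_lr(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS,
              total_steps=TOTAL_STEPS):
    """Linear warmup from 0 to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print("total_train_batch_size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (0, 250, 500, 10_750, 21_000):
    print(f"step {step:>6}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at step 500 and has dropped to half of its peak by step 10,750, midway through the decay phase.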

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Wer    | Cer    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| No log        | 1.9   | 400   | 33.6533         | 1.0    | 1.0    |
| 70.5767       | 3.81  | 800   | 6.8140          | 1.0    | 1.0    |
| 7.1379        | 5.71  | 1200  | 6.5163          | 1.0    | 1.0    |
| 6.4771        | 7.62  | 1600  | 6.4602          | 1.0    | 1.0    |
| 6.3627        | 9.52  | 2000  | 6.3406          | 1.0    | 0.9700 |
| 6.3627        | 11.43 | 2400  | 6.1021          | 1.0    | 0.9678 |
| 6.1201        | 13.33 | 2800  | 5.1523          | 1.0    | 0.8385 |
| 5.3531        | 15.24 | 3200  | 4.2224          | 1.0    | 0.7084 |
| 4.1733        | 17.14 | 3600  | 3.6981          | 1.0    | 0.6332 |
| 3.5472        | 19.05 | 4000  | 3.2708          | 0.9994 | 0.5827 |
| 3.5472        | 20.95 | 4400  | 2.9629          | 0.9989 | 0.5510 |
| 3.0668        | 22.86 | 4800  | 2.7122          | 0.9943 | 0.5165 |
| 2.7248        | 24.76 | 5200  | 2.5171          | 0.9914 | 0.4976 |
| 2.4609        | 26.67 | 5600  | 2.3538          | 0.9897 | 0.4759 |
| 2.2323        | 28.57 | 6000  | 2.2112          | 0.9874 | 0.4555 |
| 2.2323        | 30.48 | 6400  | 2.0850          | 0.9834 | 0.4370 |
| 2.0438        | 32.38 | 6800  | 1.9982          | 0.9806 | 0.4261 |
| 1.8837        | 34.29 | 7200  | 1.9179          | 0.9766 | 0.4137 |
| 1.7646        | 36.19 | 7600  | 1.8278          | 0.9766 | 0.4030 |
| 1.6469        | 38.1  | 8000  | 1.7627          | 0.9755 | 0.3937 |
| 1.6469        | 40.0  | 8400  | 1.7063          | 0.9709 | 0.3853 |
| 1.5422        | 41.9  | 8800  | 1.6649          | 0.9663 | 0.3787 |
| 1.4561        | 43.81 | 9200  | 1.6336          | 0.9697 | 0.3714 |
| 1.3842        | 45.71 | 9600  | 1.5943          | 0.9606 | 0.3647 |
| 1.3164        | 47.62 | 10000 | 1.5681          | 0.9669 | 0.3621 |
| 1.3164        | 49.52 | 10400 | 1.5535          | 0.9600 | 0.3582 |
| 1.2654        | 51.43 | 10800 | 1.5354          | 0.9538 | 0.3544 |
| 1.2186        | 53.33 | 11200 | 1.5003          | 0.9555 | 0.3482 |
| 1.1781        | 55.24 | 11600 | 1.4979          | 0.9572 | 0.3473 |
| 1.1344        | 57.14 | 12000 | 1.4820          | 0.9549 | 0.3453 |
| 1.1344        | 59.05 | 12400 | 1.4707          | 0.9509 | 0.3396 |
| 1.0965        | 60.95 | 12800 | 1.4657          | 0.9509 | 0.3384 |
| 1.0637        | 62.86 | 13200 | 1.4610          | 0.9509 | 0.3371 |
| 1.0306        | 64.76 | 13600 | 1.4461          | 0.9509 | 0.3361 |
| 1.0014        | 66.67 | 14000 | 1.4437          | 0.9503 | 0.3328 |
| 1.0014        | 68.57 | 14400 | 1.4334          | 0.9463 | 0.3304 |
| 0.9758        | 70.48 | 14800 | 1.4267          | 0.9429 | 0.3295 |
| 0.9486        | 72.38 | 15200 | 1.4250          | 0.9469 | 0.3269 |
| 0.933         | 74.29 | 15600 | 1.4214          | 0.9441 | 0.3273 |
| 0.9121        | 76.19 | 16000 | 1.4161          | 0.9441 | 0.3267 |
| 0.9121        | 78.1  | 16400 | 1.4137          | 0.9446 | 0.3268 |
| 0.9001        | 80.0  | 16800 | 1.4216          | 0.9446 | 0.3253 |
| 0.8789        | 81.9  | 17200 | 1.4164          | 0.9435 | 0.3264 |
| 0.8659        | 83.81 | 17600 | 1.3996          | 0.9424 | 0.3216 |
| 0.8471        | 85.71 | 18000 | 1.4079          | 0.9458 | 0.3226 |
| 0.8471        | 87.62 | 18400 | 1.4042          | 0.9412 | 0.3214 |
| 0.8387        | 89.52 | 18800 | 1.4073          | 0.9424 | 0.3214 |
| 0.8299        | 91.43 | 19200 | 1.4005          | 0.9418 | 0.3192 |
| 0.8257        | 93.33 | 19600 | 1.4040          | 0.9406 | 0.3200 |
| 0.813         | 95.24 | 20000 | 1.4012          | 0.9412 | 0.3184 |
| 0.813         | 97.14 | 20400 | 1.4011          | 0.9389 | 0.3183 |
| 0.8062        | 99.05 | 20800 | 1.3992          | 0.9395 | 0.3184 |
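
The reported result comes from the last row (step 20800), which is not the row with the lowest Wer. The sketch below is one way to check this: it parses rows copied from the table above and picks the best checkpoint per metric. Only the last four rows are included for brevity, and `parse_rows` is a hypothetical helper, not part of any library.

```python
# Last four rows of the results table, copied verbatim.
ROWS = """\
| 0.8257        | 93.33 | 19600 | 1.4040          | 0.9406 | 0.3200 |
| 0.813         | 95.24 | 20000 | 1.4012          | 0.9412 | 0.3184 |
| 0.813         | 97.14 | 20400 | 1.4011          | 0.9389 | 0.3183 |
| 0.8062        | 99.05 | 20800 | 1.3992          | 0.9395 | 0.3184 |
"""


def parse_rows(text):
    """Turn Markdown table rows into dicts of numeric metrics."""
    rows = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # The training-loss cell is skipped: early rows contain "No log".
        _train_loss, epoch, step, val_loss, wer, cer = cells
        rows.append({
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "wer": float(wer),
            "cer": float(cer),
        })
    return rows


rows = parse_rows(ROWS)
best_by_wer = min(rows, key=lambda r: r["wer"])
best_by_loss = min(rows, key=lambda r: r["val_loss"])

print("lowest Wer at step", best_by_wer["step"], "->", best_by_wer["wer"])
print("lowest validation loss at step", best_by_loss["step"], "->", best_by_loss["val_loss"])
```

In these rows the lowest Wer is at step 20400, while the lowest validation loss is at the final step 20800.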


### Framework versions

- Transformers 4.32.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3