alvanlii/whisper-small-cantonese

Automatic Speech Recognition

Generated from Trainer


alvanlii committed on Dec 13, 2022

Commit d7d406e (1 parent: 1806b43)

Added datasets and new training data

Files changed (1)

README.md +29 -5


@@ -20,9 +20,9 @@ model-index:
       split: test
       args: zh-HK
     metrics:
-    - name: Wer
-      type: wer
-      value: 56.0439
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
@@ -36,10 +36,34 @@ More information needed
 ## Intended uses & limitations
 More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
-### Training hyperparameters
 ### Framework versions

       split: test
       args: zh-HK
     metrics:
+    - name: Cer
+      type: cer
+      value: 11.760
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 ## Intended uses & limitations
 More information needed
 ## Training and evaluation data
+For training, three datasets were used:
+- Common Voice 11 Canto Train Set
+- CantoMap: Winterstein, Grégoire, Tang, Carmen and Lai, Regine (2020) "CantoMap: a Hong Kong Cantonese MapTask Corpus", in Proceedings of The 12th Language Resources and Evaluation Conference, Marseille: European Language Resources Association, p. 2899-2906.
+- Cantonese-ASR: Yu, Tiezheng, Frieske, Rita, Xu, Peng, Cahyawijaya, Samuel, Yiu, Cheuk Tung, Lovenia, Holy, Dai, Wenliang, Barezi, Elham, Chen, Qifeng, Ma, Xiaojuan, Shi, Bertram, Fung, Pascale (2022) "Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset", 2022. Link: https://arxiv.org/pdf/2201.02419.pdf
 ## Training procedure
+## Training Hyperparameters
+- learning_rate: 1e-5
+- train_batch_size: 16 (on 2 GPUs)
+- eval_batch_size: 8
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16x2x2=64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 500
+- training_steps: 5000
+- mixed_precision_training: Native AMP
+## Training Results
+| Training Loss | Epoch | Step | Validation Loss | Cer    |
+|:-------------:|:-----:|:----:|:---------------:|:------:|
+| 0.1106        | 0.66  | 1000 | 0.3294          | 14.638 |
+| 0.0546        | 1.33  | 2000 | 0.2887          | 12.119 |
+| 0.0293        | 2.01  | 3000 | 0.2727          | 11.646 |
+| 0.0214        | 2.66  | 4000 | 0.2741          | 11.760 |
+| xx           | xx | 5000 | xx          | xx |
 ### Framework versions
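
The hyperparameters added in this diff can be sanity-checked with a small sketch. It reproduces the `16x2x2=64` total batch size and estimates the learning rate at a given step, assuming that `lr_scheduler_type: linear` means a linear warmup from 0 to the peak rate over 500 steps followed by a linear decay to 0 at step 5000. That shape is an assumption; the diff only names the scheduler type and the warmup length. The function names `effective_batch_size` and `linear_lr` are hypothetical helpers, not part of any library.

```python
# Sketch only: values are copied from the hyperparameter list in the diff;
# the helper names and the exact schedule shape are assumptions.

LEARNING_RATE = 1e-5        # learning_rate
PER_DEVICE_BATCH_SIZE = 16  # train_batch_size
NUM_GPUS = 2                # "(on 2 GPUs)"
GRAD_ACCUM_STEPS = 2        # gradient_accumulation_steps
WARMUP_STEPS = 500          # lr_scheduler_warmup_steps
TRAINING_STEPS = 5000       # training_steps


def effective_batch_size(per_device: int, num_devices: int, accum_steps: int) -> int:
    """Number of samples contributing to one optimizer update."""
    return per_device * num_devices * accum_steps


def linear_lr(step: int,
              peak_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TRAINING_STEPS) -> float:
    """Learning rate at `step` under linear warmup then linear decay (assumed shape)."""
    if step < warmup:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / warmup
    # Decay phase: fall linearly from peak_lr to 0 at the final step.
    return peak_lr * max(0.0, (total - step) / (total - warmup))


if __name__ == "__main__":
    print("total_train_batch_size:",
          effective_batch_size(PER_DEVICE_BATCH_SIZE, NUM_GPUS, GRAD_ACCUM_STEPS))
    # Steps at which the results table reports evaluation numbers.
    for step in (1000, 2000, 3000, 4000, 5000):
        print(step, linear_lr(step))
```

Under these assumptions the peak rate is reached at step 500 and the rate is already decaying at every evaluation step listed in the results table.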