alvanlii committed on
Commit
309b69d
1 Parent(s): 5c72a5a

Create README.md

  1. README.md +62 -0
README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ language:
+ - zh
+ license: apache-2.0
+ tags:
+ - whisper-event
+ - generated_from_trainer
+ datasets:
+ - mozilla-foundation/common_voice_11_0
+ model-index:
+ - name: Whisper Large V2 zh-HK - Alvin
+   results:
+   - task:
+       name: Automatic Speech Recognition
+       type: automatic-speech-recognition
+     dataset:
+       name: mozilla-foundation/common_voice_11_0 zh-HK
+       type: mozilla-foundation/common_voice_11_0
+       config: zh-HK
+       split: test
+       args: zh-HK
+     metrics:
+     - name: Normalized CER
+       type: cer
+       value: 10.11
+ ---
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # Whisper Large V2 zh-HK - Alvin
+
+ This model is a fine-tuned version of [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) on the Common Voice 11.0 dataset. It was trained with PEFT LoRA on a base model loaded in INT8 via bitsandbytes (BNB).
+
+ ## Training and evaluation data
+ For training, three datasets were used:
+ - Common Voice 11.0 Cantonese train set
+ - CantoMap: Winterstein, Grégoire, Tang, Carmen and Lai, Regine (2020) "CantoMap: a Hong Kong Cantonese MapTask Corpus", in Proceedings of The 12th Language Resources and Evaluation Conference, Marseille: European Language Resources Association, p. 2899-2906.
+ - Cantonese-ASR: Yu, Tiezheng, Frieske, Rita, Xu, Peng, Cahyawijaya, Samuel, Yiu, Cheuk Tung, Lovenia, Holy, Dai, Wenliang, Barezi, Elham, Chen, Qifeng, Ma, Xiaojuan, Shi, Bertram, Fung, Pascale (2022) "Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset". Link: https://arxiv.org/pdf/2201.02419.pdf
+
+ ## Training Hyperparameters
+ - learning_rate: 5e-5
+ - train_batch_size: 60 (on a single 3090 GPU)
+ - eval_batch_size: 10
+ - gradient_accumulation_steps: 1
+ - total_train_batch_size: 60 (60 per device x 1 GPU x 1 accumulation step)
+ - optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 500
+ - training_steps: 15000
+ - augmentation: SpecAugment
+
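The scheduler settings listed above (peak learning rate 5e-5, 500 warmup steps, 15000 training steps, linear schedule) can be illustrated with a small pure-Python sketch. It assumes the usual meaning of a "linear" schedule with warmup: the rate rises linearly from zero to the peak during warmup, then decays linearly to zero at the final step. The function names `linear_warmup_decay_lr` and `effective_batch_size` are hypothetical helpers written for this illustration and are not taken from the training script.

```python
def linear_warmup_decay_lr(step, peak_lr=5e-5, warmup_steps=500, total_steps=15000):
    """Learning rate at an optimizer step, assuming linear warmup then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


def effective_batch_size(per_device, num_devices, grad_accum_steps):
    """Number of samples contributing to one optimizer update."""
    return per_device * num_devices * grad_accum_steps


# Print the assumed learning rate at a few points of the run.
for step in (0, 250, 500, 7750, 15000):
    print(step, linear_warmup_decay_lr(step))

# Batch size per optimizer update for the settings in the list above.
print(effective_batch_size(per_device=60, num_devices=1, grad_accum_steps=1))
```

Under these assumptions the rate peaks at step 500 and is back at half the peak by step 7750, the midpoint of the decay phase.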
+ ## Training Results
+
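The Normalized CER values reported in this card are character error rates, expressed as percentages. As a rough sketch of how such a metric can be computed: sum the character-level edit distance between each reference and hypothesis, then divide by the total number of reference characters. The `normalize` step here (dropping whitespace and punctuation) is an assumption made for illustration only; the card does not say which normalization was actually applied, and all function names are hypothetical.

```python
import unicodedata


def normalize(text):
    """Assumed normalization: remove whitespace and Unicode punctuation."""
    return "".join(
        ch for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level character error rate in percent."""
    total_edits = 0
    total_chars = 0
    for ref, hyp in zip(references, hypotheses):
        ref, hyp = normalize(ref), normalize(hyp)
        total_edits += edit_distance(ref, hyp)
        total_chars += len(ref)
    if total_chars == 0:
        raise ValueError("references contain no characters after normalization")
    return 100.0 * total_edits / total_chars


# One substituted character out of three reference characters.
print(cer(["你好嗎？"], ["你好嘛"]))
```

Aggregating edits and characters over the whole test set, rather than averaging per-utterance rates, keeps short utterances from dominating the score.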
+ | Training Loss | Epoch | Step | Validation Loss | Normalized CER |
+ |:-------------:|:-----:|:----:|:---------------:|:------:|
+ | 0.4610 | 0.55 | 2000 | 0.3106 | 13.08 |
+ | 0.3441 | 1.11 | 4000 | 0.2875 | 11.79 |
+ | 0.3466 | 1.66 | 6000 | 0.2820 | 11.44 |
+ | 0.2539 | 2.22 | 8000 | 0.2777 | 10.59 |
+ | 0.2312 | 2.77 | 10000 | 0.2822 | 10.60 |
+ | 0.1639 | 3.32 | 12000 | 0.2859 | 10.17 |
+ | 0.1569 | 3.88 | 14000 | 0.2866 | 10