updated readme
README.md
CHANGED
@@ -26,21 +26,48 @@ model-index:
     - name: Wer
       type: wer
       value: 3.8273540533062804
 ---

-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
 # Whisper Medium Indonesian

-This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the
-
 - Loss: 0.0698
 - Wer: 3.8274
-
-
-
-

 ## Intended uses & limitations

@@ -80,7 +107,29 @@ The following hyperparameters were used during training:
 | 0.0122 | 2.98 | 9000 | 0.0714 | 3.9795 |
 | 0.0049 | 3.31 | 10000 | 0.0720 | 3.9887 |


 ### Framework versions

 - Transformers 4.26.0.dev0
@@ -26,21 +26,48 @@ model-index:
     - name: Wer
       type: wer
       value: 3.8273540533062804
+  - task:
+      name: Automatic Speech Recognition
+      type: automatic-speech-recognition
+    dataset:
+      name: google/fleurs id_id
+      type: google/fleurs
+      config: id_id
+      split: test
+    metrics:
+    - name: Wer
+      type: wer
+      value: 9.74
+
 ---

 # Whisper Medium Indonesian

+This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the
+Indonesian mozilla-foundation/common_voice_11_0, magic_data, titml and google/fleurs datasets. It achieves the following
+results:
+### CV11 test split:
 - Loss: 0.0698
 - Wer: 3.8274
+### Google/fleurs test split:
+- Wer: 9.74
+
+## Usage
+
+```python
+from transformers import pipeline
+# Load the fine-tuned checkpoint as a speech-recognition pipeline
+transcriber = pipeline(
+    "automatic-speech-recognition",
+    model="cahya/whisper-medium-id"
+)
+# Force the decoder to transcribe (not translate) in Indonesian
+transcriber.model.config.forced_decoder_ids = (
+    transcriber.tokenizer.get_decoder_prompt_ids(
+        language="id",
+        task="transcribe"
+    )
+)
+# The pipeline returns a dict; the transcript is under the "text" key
+transcription = transcriber("my_audio_file.mp3")
+```

 ## Intended uses & limitations

@@ -80,7 +107,29 @@ The following hyperparameters were used during training:
 | 0.0122 | 2.98 | 9000 | 0.0714 | 3.9795 |
 | 0.0049 | 3.31 | 10000 | 0.0720 | 3.9887 |

+## Evaluation
+
+We evaluated the model on the test splits of two datasets: [Common Voice 11](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0)
+and [Google Fleurs](https://huggingface.co/datasets/google/fleurs).
+Because Whisper can transcribe casing and punctuation, we evaluate its performance on both raw and normalized text
+(lowercasing and punctuation removal). The results are as follows:
+
+### Common Voice 11
+
+| Model | WER |
+|---------------------------------------------------------------------------|------|
+| [cahya/whisper-medium-id](https://huggingface.co/cahya/whisper-medium-id) | 3.83 |
+| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | tbc |
+
+### Google/Fleurs

+| Model | WER |
+|-------------------------------------------------------------------------------------------------------------|------|
+| [cahya/whisper-medium-id](https://huggingface.co/cahya/whisper-medium-id) | 9.74 |
+| [cahya/whisper-medium-id](https://huggingface.co/cahya/whisper-medium-id) + text normalization | tbc |
+| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | tbc |
+| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) + text normalization | tbc |
+
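The card does not show how the WER figures or the text normalization (lowercasing and punctuation removal) were computed. The following is a minimal sketch under stated assumptions: a plain word-level edit distance, ASCII punctuation only, and WER reported in percent. The names `normalize` and `wer` are illustrative and are not taken from the author's evaluation script.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace (assumed normalization)."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


def wer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level word error rate in percent: total word edits / total reference words."""
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        r, h = ref.split(), hyp.split()
        # Word-level Levenshtein distance, keeping only the previous DP row
        prev = list(range(len(h) + 1))
        for i, ref_word in enumerate(r, start=1):
            curr = [i]
            for j, hyp_word in enumerate(h, start=1):
                cost = 0 if ref_word == hyp_word else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution or match
            prev = curr
        errors += prev[-1]
        total += len(r)
    if total == 0:
        raise ValueError("references contain no words")
    return 100.0 * errors / total


reference = "Selamat pagi, dunia!"
hypothesis = "selamat pagi dunia"
raw_wer = wer([reference], [hypothesis])
normalized_wer = wer([normalize(reference)], [normalize(hypothesis)])
```

In this toy pair the raw score counts every casing or punctuation mismatch as a word error, while the normalized score ignores them, which is why the two settings are reported separately.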
 ### Framework versions

 - Transformers 4.26.0.dev0