---
title: README
emoji: 🐨
colorFrom: blue
colorTo: yellow
sdk: static
pinned: false
---
# Japanese ASR

This repository contains the models and datasets used to train and evaluate Japanese ASR systems, produced while developing the [kotoba-whisper models](https://huggingface.co/collections/kotoba-tech/kotoba-whisper-661d04846a2892cc27a23921).
The following tables compare character error rate (CER) and word error rate (WER) across the different amounts of ReazonSpeech data used to distill [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3). The model names follow the pattern
`japanese-asr/distil-whisper-large-v3-ja-reazonspeech-{size of reazonspeech}`.
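
As a minimal sketch of the naming pattern above, the snippet below builds the repository ID for each size label that appears in the tables on this page. The helper names `distil_model_id` and `load_asr_pipeline` are hypothetical, and loading through the `transformers` `pipeline` API is an assumption about how these checkpoints are meant to be used, not something this README states.

```python
# Size labels taken from the model names listed in the tables on this page.
SIZES = ["tiny", "small", "medium", "large", "all"]


def distil_model_id(size: str) -> str:
    """Build a repository ID following the naming pattern (hypothetical helper)."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    return f"japanese-asr/distil-whisper-large-v3-ja-reazonspeech-{size}"


def load_asr_pipeline(size: str):
    """Load a checkpoint for transcription (assumes `transformers` is installed)."""
    # Imported lazily so that building IDs works without transformers.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=distil_model_id(size))
```

For example, `distil_model_id("large")` gives the ID of the checkpoint trained on the `large` subset, and `load_asr_pipeline("large")` would download that checkpoint on first use.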

***CER***

| model                                                                                                                                             |   [CommonVoice 8 (Japanese test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) |   [JSUT Basic 5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) |   [ReazonSpeech (held out test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
|:--------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------:|----------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------------:|
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all)       |                                                                                                         9.2 |                                                                                     8.4 |                                                                                                        11.6 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large)   |                                                                                                         9.4 |                                                                                     8.5 |                                                                                                        12.2 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium) |                                                                                                        10.9 |                                                                                    11.3 |                                                                                                        14.8 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small)   |                                                                                                        30.2 |                                                                                    39   |                                                                                                        40.7 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny)     |                                                                                                        94.8 |                                                                                    96.3 |                                                                                                        96.7 |
| [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)                                                                         |                                                                                                         8.5 |                                                                                     7.1 |                                                                                                        14.9 |
| [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2)                                                                         |                                                                                                         9.7 |                                                                                     8.2 |                                                                                                        28.1 |
| [openai/whisper-large](https://huggingface.co/openai/whisper-large)                                                                               |                                                                                                        10   |                                                                                     8.9 |                                                                                                        34.1 |
| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium)                                                                             |                                                                                                        11.5 |                                                                                    10   |                                                                                                        33.2 |
| [openai/whisper-base](https://huggingface.co/openai/whisper-base)                                                                                 |                                                                                                        28.6 |                                                                                    24.9 |                                                                                                        70.4 |
| [openai/whisper-small](https://huggingface.co/openai/whisper-small)                                                                               |                                                                                                        15.1 |                                                                                    14.2 |                                                                                                        41.5 |
| [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)                                                                                 |                                                                                                        53.7 |                                                                                    36.5 |                                                                                                       137.9 |
| [reazon-research/reazonspeech-nemo-v2](https://huggingface.co/reazon-research/reazonspeech-nemo-v2)                                               |                                                                                                         9.1 |                                                                                     7.4 |                                                                                                        11.2 | 

***WER***

| model                                                                                                                                             |   [CommonVoice 8 (Japanese test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) |   [JSUT Basic 5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) |   [ReazonSpeech (held out test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
|:--------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------:|----------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------------:|
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all)       |                                                                                                        58.8 |                                                                                    63.7 |                                                                                                        55.6 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large)   |                                                                                                        59.2 |                                                                                    64.3 |                                                                                                        56.4 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium) |                                                                                                        64.6 |                                                                                    72.1 |                                                                                                        63   |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small)   |                                                                                                        85   |                                                                                    94.2 |                                                                                                        82.1 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny)     |                                                                                                       100   |                                                                                   100   |                                                                                                        99   |
| [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)                                                                         |                                                                                                        55.1 |                                                                                    59.2 |                                                                                                        60.2 |
| [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2)                                                                         |                                                                                                        59.3 |                                                                                    63.2 |                                                                                                        74.1 |
| [openai/whisper-large](https://huggingface.co/openai/whisper-large)                                                                               |                                                                                                        61.1 |                                                                                    66.4 |                                                                                                        74.9 |
| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium)                                                                             |                                                                                                        63.4 |                                                                                    69.5 |                                                                                                        76   |
| [openai/whisper-base](https://huggingface.co/openai/whisper-base)                                                                                 |                                                                                                        87.2 |                                                                                    93   |                                                                                                        91.8 |
| [openai/whisper-small](https://huggingface.co/openai/whisper-small)                                                                               |                                                                                                        74.2 |                                                                                    81.9 |                                                                                                        83   |
| [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)                                                                                 |                                                                                                        93.8 |                                                                                    97.6 |                                                                                                        94.9 |
| [reazon-research/reazonspeech-nemo-v2](https://huggingface.co/reazon-research/reazonspeech-nemo-v2)                                               |                                                                                                        57.5 |                                                                                    60.6 |                                                                                                        47.5 | 




Note that [kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0) is an alias of [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large) 
and [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0) is an alias of [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all).

More detailed results are available in the [kotoba-whisper codebase](https://github.com/kotoba-tech/kotoba-whisper).
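
For readers unfamiliar with the metrics in the tables above, here is a minimal sketch of how an edit-distance-based error rate can be computed. It is an illustration only: the text normalization, the word segmentation used for Japanese WER, and the way scores are aggregated over a test set are not described on this page, so the function names and the pre-tokenized word lists below are assumptions. The sketch also shows why a rate can exceed 100, as in the CER of `openai/whisper-tiny` on ReazonSpeech: insertions are counted as errors but do not increase the reference length.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum number of substitutions, deletions, insertions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence, hyp: Sequence) -> float:
    """Edit distance as a percentage of the reference length."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate over raw characters (no normalization applied)."""
    return error_rate(list(reference), list(hypothesis))


def wer(ref_words: Sequence[str], hyp_words: Sequence[str]) -> float:
    """Word error rate over already-segmented words (the segmenter is left to the caller)."""
    return error_rate(ref_words, hyp_words)
```

A short usage example: `cer("こんにちは", "こんばんは")` counts two substituted characters out of five, while `cer("はい", "はいはいはい")` counts four inserted characters against a two-character reference and therefore exceeds 100.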