---
license: apache-2.0
language:
- ja
library_name: espnet
tags:
- automatic-speech-recognition
---
# reazonspeech-espnet-v2
`reazonspeech-espnet-v2` is an automatic speech recognition (ASR) model
trained on the [ReazonSpeech v2.0 corpus](https://huggingface.co/datasets/reazon-research/reazonspeech).
## Model Architecture
The general architecture is the same as [reazonspeech-espnet-v1](https://huggingface.co/reazon-research/reazonspeech-espnet-v1).
* Conformer-Transducer model with 118.85M parameters.
* We trained this model for 33 epochs using the Adam optimizer. The maximum
  learning rate was 0.02, with 15000 warmup steps.
* The training audio files were sampled at 16 kHz. Make sure that your
  input audio files have the same sampling rate.
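
The sampling-rate requirement above can be checked before transcription. The following is a minimal sketch, assuming the input is an uncompressed PCM WAV file; it uses only Python's standard `wave` module, and the helper names `wav_sample_rate` and `matches_model_rate` are hypothetical, not part of the reazonspeech library.

```python
import wave

# Sampling rate of the training audio, as stated in this model card.
EXPECTED_RATE_HZ = 16_000


def wav_sample_rate(path: str) -> int:
    """Return the sampling rate (in Hz) stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def matches_model_rate(path: str, expected_hz: int = EXPECTED_RATE_HZ) -> bool:
    """Return True if the WAV file is sampled at the expected rate."""
    return wav_sample_rate(path) == expected_hz
```

For example, `matches_model_rate("speech.wav")` returns `False` for a 44.1 kHz recording, in which case the file should be resampled to 16 kHz with an audio tool of your choice before it is passed to the model.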
## Usage
We recommend using this model through our
[reazonspeech](https://github.com/reazon-research/reazonspeech)
library.
```python
from reazonspeech.espnet.asr import load_model, transcribe, audio_from_path

# Load the input audio from disk
audio = audio_from_path("speech.wav")
# Load the ASR model
model = load_model()
# Run recognition and print the transcribed text
ret = transcribe(model, audio)
print(ret.text)
```
## License
[Apache License 2.0](https://choosealicense.com/licenses/apache-2.0/)