Enabling a custom Japanese model
In the config file I noticed the part about uncommenting the section to enable custom Japanese models. I tried uncommenting it, but it led me down a rabbit hole of errors, starting with
ValueError: Invalid model size 'vumichien/whisper-large-v2-mix-jp', expected one of: tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large-v1, large-v2
and as I tried fixing each one, new ones just kept popping up. So my question is: is this feature not implemented yet, or am I missing something obvious?
Thanks in advance!
There's currently no support for automatic conversion to the model type (CTranslate2) used by faster-whisper, but you can do this manually using the CLI.
For instance, to convert vumichien/whisper-large-v2-mix-jp to CTranslate2, first download the repository locally:
git lfs install
git clone https://huggingface.co/vumichien/whisper-large-v2-mix-jp
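If you'd rather avoid git-lfs, the same download can be done from Python with the huggingface_hub library. This is just a sketch; it assumes huggingface_hub is installed and reasonably recent (the local_dir argument is not available in very old versions):

# Download the whole model repository into ./whisper-large-v2-mix-jp
# (assumes: pip install huggingface_hub)
from huggingface_hub import snapshot_download

snapshot_download(repo_id="vumichien/whisper-large-v2-mix-jp",
                  local_dir="whisper-large-v2-mix-jp")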
Then add the tokenizer.json file from openai/whisper-large-v2, as it appears to be missing in whisper-large-v2-mix-jp:
cd whisper-large-v2-mix-jp
wget https://huggingface.co/openai/whisper-large-v2/raw/main/tokenizer.json
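Equivalently, you can fetch just that one file from Python with huggingface_hub (a sketch under the same assumptions as above):

# Fetch only tokenizer.json from openai/whisper-large-v2 into the model folder
from huggingface_hub import hf_hub_download

hf_hub_download(repo_id="openai/whisper-large-v2",
                filename="tokenizer.json",
                local_dir="whisper-large-v2-mix-jp")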
Then use the ct2-transformers-converter CLI (as explained in the README of faster-whisper) to convert the model in whisper-large-v2-mix-jp to CTranslate2 (you might want to use Anaconda or a virtual Python environment here):
cd ..
pip install "transformers[torch]>=4.23"
ct2-transformers-converter --model ./whisper-large-v2-mix-jp --output_dir whisper-large-v2-ct2 --copy_files tokenizer.json --quantization float16
That should produce the converted model in the directory whisper-large-v2-ct2.
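The same conversion can also be run from Python through CTranslate2's converter API instead of the CLI; a minimal sketch, assuming ctranslate2 and transformers[torch] are installed:

# Convert the Transformers checkpoint to CTranslate2 via the Python API
from ctranslate2.converters import TransformersConverter

converter = TransformersConverter("./whisper-large-v2-mix-jp",
                                  copy_files=["tokenizer.json"])
converter.convert("whisper-large-v2-ct2", quantization="float16")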
You can now reference this directory directly if you use a type other than "whisper". Here I just set it to "filesystem" to disable any conversion. I also used a Windows-style path, but you can use Unix paths if you are on Linux/Mac:
{
    "name": "whisper-large-v2-mix-jp",
    "url": "J:\\Dev\\Models\\whisper\\whisper-large-v2-ct2",
    "type": "filesystem"
}
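If you want to sanity-check the converted model outside the Web UI, here is a minimal faster-whisper sketch (the audio file name is just a placeholder):

# Load the converted directory directly and transcribe a short clip
from faster_whisper import WhisperModel

model = WhisperModel("J:\\Dev\\Models\\whisper\\whisper-large-v2-ct2",
                     device="cuda", compute_type="float16")
segments, info = model.transcribe("audio.wav", language="ja")  # placeholder file
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")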
But yeah, ideally faster-whisper ought to be able to do this conversion automatically. It just hasn't been a priority, as I find "large-v2" to be more than good enough for my use.
Thanks a lot, I got it to work!
I agree that in general the large-v2 model is very good; it's just that some topics sometimes translate a bit weirdly. I was mostly curious how much better this specialized model actually is.
Again, thanks for the help on this, and thanks in general for the project. It's been a big help!