Multilingual PL-BERT checkpoint
The checkpoint open-sourced here was trained by Papercup using the open-source PL-BERT model (https://github.com/yl4579/PL-BERT). It is trained to be used with StyleTTS2 (https://github.com/yl4579/StyleTTS2). The languages it has been trained on are listed in this model card; they correspond to the crowdsourced dataset at https://huggingface.co/datasets/styletts2-community/multilingual-phonemes-10k-alpha.
Notable differences compared to the default PL-BERT checkpoint and config available here:
- Because we are working with many languages, we are now using a different tokenizer: `bert-base-multilingual-cased`.
- The PL-BERT model was trained on the data obtained from `styletts2-community/multilingual-phonemes-10k-alpha` for 1.1M iterations.
- The `token_maps.pkl` file has changed (also open-sourced here).
- We have changed the `util.py` file to deal with an error when loading `new_state_dict["embeddings.position_ids"]`.
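
For orientation, here is a minimal sketch of how the first three items fit together, assuming the `transformers` library and that `token_maps.pkl` has been downloaded into the working directory; the description of the mapping follows the upstream PL-BERT recipe, and the files shipped here remain the reference.

```python
import pickle

from transformers import AutoTokenizer

# The multilingual tokenizer this checkpoint was trained with.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

# The updated token map shipped with this checkpoint. In the upstream PL-BERT
# recipe, it maps tokenizer token ids onto the reduced token set used in training.
with open("token_maps.pkl", "rb") as f:
    token_maps = pickle.load(f)

print(f"tokenizer vocab size: {tokenizer.vocab_size}, token map entries: {len(token_maps)}")
```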
How do I train StyleTTS2 with this new PL-BERT checkpoint?
- Simply create a new folder under `Utils` in your StyleTTS2 repository. Call it, for example, `PLBERT_all_languages`.
- Copy `config.yml`, `step_1100000.t7` and `util.py` into it.
- Then, in your StyleTTS2 config file, change `PLBERT_dir` to `Utils/PLBERT_all_languages`. You will also need to change your import as follows (a quick loading check is sketched at the bottom of this card):
    - Change `from Utils.PLBERT.util import load_plbert`
    - To `from Utils.PLBERT_all_languages.util import load_plbert`
    - Alternatively, you can just replace the relevant files in `Utils/PLBERT` and not have to change any code.
- Now, you need to create train and validation files. You will need to use `espeak` to create a file in the same format as the ones that exist in the `Data` folder of the StyleTTS2 repository (see the sketch after this list). Careful! You will need to change the `language` argument to phonemise your text if it is not in English. You can find the correct language codes here. For example, Latin American Spanish is `es-419`.
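
As an example, here is a minimal sketch of building such a file with the `phonemizer` package's espeak backend. The wav paths, transcripts, speaker ids and output file name are hypothetical, and the pipe-separated `path|phonemes|speaker` layout follows the LJSpeech-style lists in the StyleTTS2 `Data` folder; adapt all of these to your own dataset.

```python
from phonemizer.backend import EspeakBackend

# Use the espeak language code matching your data,
# e.g. "es-419" for Latin American Spanish instead of "en-us".
backend = EspeakBackend(language="es-419", preserve_punctuation=True, with_stress=True)

# Hypothetical (wav file, transcript, speaker id) triples for your own dataset.
samples = [
    ("audio/clip_0001.wav", "¿Dónde está la biblioteca?", 0),
    ("audio/clip_0002.wav", "Me gusta entrenar modelos de voz.", 0),
]

with open("Data/train_list_es.txt", "w", encoding="utf-8") as f:
    for wav_path, text, speaker in samples:
        phonemes = backend.phonemize([text], strip=True)[0]
        # Same pipe-separated layout as the list files in StyleTTS2's Data folder.
        f.write(f"{wav_path}|{phonemes}|{speaker}\n")
```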
Voila, you can now train a multilingual StyleTTS2 model!
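
Before kicking off training, you can sanity-check that the new folder is picked up. A minimal sketch, assuming you run it from the root of the StyleTTS2 repository with the folder set up as above:

```python
# Run from the StyleTTS2 repository root.
from Utils.PLBERT_all_languages.util import load_plbert

# load_plbert reads config.yml and the step_*.t7 checkpoint from the given
# directory and returns the PL-BERT encoder used by StyleTTS2.
plbert = load_plbert("Utils/PLBERT_all_languages")
print(plbert)
```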
Thank you to Aaron (Yinghao) Li for these contributions.