How to fine-tune MMS text-to-speech models?
Is there any workaround for fine-tuning MMS-TTS models?
Any updates on this??
It's ongoing! The model addition is in the final review stages, then we can work on a fine-tuning script (cc @ylacombe )
any updates on this? @sanchit-gandhi ?
Following
any updates on this please?
Hi there, I'm currently working on finetuning VITS and MMS, stay tuned!
Hey @arbianqx , it's still a WIP.
If you are interested, here are the two ongoing PRs I'm working on: https://github.com/huggingface/transformers/pull/27340 https://github.com/huggingface/transformers/pull/27244
Note that as long as the PRs are not merged, I can't really give you support on this.
On another note, what languages are you interested in? Finetuning MMS is an interesting task, and I'm trying to understand which languages are the most interesting to work on!
Hi @ylacombe any update on this?
Hey, I haven't made any official announcements yet, but you can already find what you want in the following library: https://github.com/ylacombe/finetune-hf-vits
Don't hesitate to give feedback and share your finetuned models if you can!
Hey, thank you very much @ylacombe and the team. Much appreciated.
Hi @ylacombe , hope you're doing well. Could you please help me? I want to finetune an MMS-TTS model (facebook/mms-tts-urd-script_arabic) for the Urdu language. Specifically, I want to finetune it on a specific speaker's audio. How can I create a speaker embedding for that speaker and finetune the model so that it produces audio in that particular speaker's voice? Also, please tell me how to do this if I want multiple speakers in the same model. Your help would be appreciated. Happy New Year!
@sanchit-gandhi Hi Brother, any update on the finetuning of MMS-TTS (facebook/mms-tts-urd-script-arabic)?
Yeah, it's working great!!! thanks to @ylacombe
@syedmuhammad Thank you for the response. Could you please share the link? Thanks
Kindly refer to the repo: https://github.com/ylacombe/finetune-hf-vits
@syedmuhammad
Thanks, I will check this.
Do you have your own training Colab notebook for the Urdu language using the following model?
facebook/mms-tts-urd-script_arabic
Yes
@syedmuhammad Would you be willing to share it?
You can email me at: [email protected]
During fine-tuning I got the following error:
    return tensor.to(device, non_blocking=non_blocking)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: BatchEncoding.to() got an unexpected keyword argument 'non_blocking'
Is there a solution?
@charbossly Maybe this solution will help: https://github.com/ylacombe/finetune-hf-vits/issues/22
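For context, here is a minimal sketch of one possible workaround. It is not necessarily the fix described in the linked issue. It assumes the error is raised when a device-placement helper calls .to(device, non_blocking=...) on the whole batch object, and that the helper recurses into plain mappings instead. The names to_plain_dict and FakeBatch are hypothetical; FakeBatch only stands in for BatchEncoding so that the example runs without any third-party install.

```python
from collections import UserDict
from collections.abc import Mapping


def to_plain_dict(batch: Mapping) -> dict:
    """Copy a dict-like batch into a plain dict (hypothetical helper).

    A plain dict has no .to() method, so a helper that recurses into
    mappings would move each value on its own rather than calling
    batch.to(device, non_blocking=...) on the container.
    """
    return {key: value for key, value in batch.items()}


class FakeBatch(UserDict):
    """Stand-in for BatchEncoding: dict-like, with a .to() that takes no non_blocking."""

    def to(self, device):
        return self


# Assumption: the data collator returns a BatchEncoding-like object.
batch = FakeBatch({"input_ids": [[1, 2, 3]], "attention_mask": [[1, 1, 1]]})
plain = to_plain_dict(batch)

print(type(plain).__name__, hasattr(plain, "to"))
```

In a real training script, the conversion would presumably go at the end of the data collator, just before the batch is returned. Upgrading or pinning the libraries involved may also resolve the mismatch, but which versions work is not confirmed here.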
@khof312 Thanks for your solution.
Hello,
I recently finetuned an MMS model using the Hugging Face tools provided at https://github.com/ylacombe/finetune-hf-vits and have successfully obtained a VITS model in the model.safetensors format. While I found documentation on how to export the original MMS model to Sherpa-ONNX (https://k2-fsa.github.io/sherpa/onnx/tts/mms.html), I couldn't find information on how to export this finetuned TTS model to Sherpa-ONNX.
Could you please provide guidance or steps on how to achieve this?
Any updates?