
How do you finetune mms-1b on Hugging Face?

#4
by KaleSalad - opened

I am trying to use the same pipeline I used to fine-tune facebook/wav2vec2-xls-r-300m to train this model (facebook/mms-1b), since both use the same model class, Wav2Vec2ForCTC. However, I am running into this error in the backward pass when I try to fine-tune:

[screenshot of the backward-pass error]

How do I fine-tune mms-1b? Thank you.
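For reference, my setup is essentially the standard XLS-R CTC recipe with the checkpoint name swapped out. A rough sketch of what I mean (the vocab.json here is a placeholder for the vocabulary file built from my own dataset, as in the XLS-R fine-tuning guide):

```python
from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2ForCTC,
    Wav2Vec2Processor,
)

# Tokenizer built from a locally created vocab.json, as in the XLS-R guide
tokenizer = Wav2Vec2CTCTokenizer(
    "./vocab.json", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|"
)
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1, sampling_rate=16000, padding_value=0.0,
    do_normalize=True, return_attention_mask=True,
)
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

# Same model class as for wav2vec2-xls-r-300m, only the checkpoint changes
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/mms-1b",
    ctc_loss_reduction="mean",
    pad_token_id=processor.tokenizer.pad_token_id,
    vocab_size=len(processor.tokenizer),
)
model.freeze_feature_encoder()
```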

Thank you for adding these resources!

KaleSalad changed discussion status to closed

And https://huggingface.co/blog/mms_adapters
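The key difference in that guide versus the usual CTC fine-tuning recipe is that only the language adapter layers are trained, while the rest of the model stays frozen. A condensed sketch of that step (assuming a transformers version with MMS support; `_get_adapters` is the private helper the blog post itself uses):

```python
from transformers import Wav2Vec2ForCTC

model = Wav2Vec2ForCTC.from_pretrained("facebook/mms-1b-all")

# Re-initialise the adapter layers and freeze everything else,
# so the backward pass only updates the small per-language adapter weights
model.init_adapter_layers()
model.freeze_base_model()
for param in model._get_adapters().values():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable / 1e6:.1f}M")
```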

@sanchit-gandhi Why is the mms-1b-all model in the blog fine-tuned on Common Voice? I thought mms-1b-all had already been fine-tuned on Common Voice (along with MMS-lab, FLEURS, VP, and MLS). Does this have to do with the language-specific adapter weights?
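(For context, by language-specific adapter weights I mean the per-language weights that mms-1b-all swaps in at inference time, e.g. for French:)

```python
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "facebook/mms-1b-all"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Swap in the French adapter weights and the matching tokenizer vocabulary
processor.tokenizer.set_target_lang("fra")
model.load_adapter("fra")
```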

KaleSalad changed discussion status to open

How can I add a different language to MMS?

In https://huggingface.co/blog/mms_adapters, only the tokenizer takes the target_lang argument. So, if a language is not covered by MMS, can we create the vocab JSON file for the new language ourselves and start training the adapters on it? Would that add the new language to the model?
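Something like this is what I have in mind, following the nested vocab format from the blog (a rough sketch, assuming a transformers version with MMS support; "abc" and the tiny vocab_dict are placeholders for the new language's code and the vocabulary extracted from its text data):

```python
import json
from transformers import Wav2Vec2CTCTokenizer

target_lang = "abc"  # placeholder code for the new language
vocab_dict = {"[PAD]": 0, "[UNK]": 1, "|": 2, "a": 3, "b": 4, "c": 5}  # built from the new language's text

# MMS vocab files nest the character vocabulary under the language code
with open("vocab.json", "w") as f:
    json.dump({target_lang: vocab_dict}, f)

tokenizer = Wav2Vec2CTCTokenizer(
    "vocab.json",
    unk_token="[UNK]",
    pad_token="[PAD]",
    word_delimiter_token="|",
    target_lang=target_lang,
)
```

The idea would then be to load the model with the new vocab size and train only the adapters, as in the blog. Whether that effectively adds the new language is what I am asking.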

@patrickvonplaten @sanchit-gandhi
