timestamp decoding #9
opened by StephennFernandes
Hi there, is there a way to get timestamp decoding with MMS, similar to the OpenAI Whisper models?
Yep, easiest done with the pipeline. For character-level timestamps:
from transformers import pipeline
transcriber = pipeline(model="facebook/mms-1b-all")
transcriber("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.flac", return_timestamps="char")
For word-level timestamps:
from transformers import pipeline
transcriber = pipeline(model="facebook/mms-1b-all")
transcriber("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.flac", return_timestamps="word")
See docs for more details: https://huggingface.co/docs/transformers/main/en/main_classes/pipelines#transformers.AutomaticSpeechRecognitionPipeline
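For reference, the pipeline returns the transcription under "text" and the timestamps as a list of "chunks", each carrying a text span and a (start, end) tuple in seconds (this follows the pipeline documentation; treat the exact values as illustrative). A small example of consuming the word-level output from the same call as above:

from transformers import pipeline

transcriber = pipeline(model="facebook/mms-1b-all")
result = transcriber(
    "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.flac",
    return_timestamps="word",
)

# print each word with its start/end time in seconds
for chunk in result["chunks"]:
    start, end = chunk["timestamp"]
    print(f"{start:5.2f}s - {end:5.2f}s  {chunk['text']}")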
Hey @sanchit-gandhi, thanks a ton for taking the time to reply.
Could you please tell me how I could do this in regular inference mode as well, e.g.:
import torch

inputs = processor(en_sample, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs).logits
ids = torch.argmax(outputs, dim=-1)[0]
transcription = processor.decode(ids)
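Not an official answer, but for reference: since facebook/mms-1b-all is a Wav2Vec2ForCTC checkpoint, the CTC tokenizer's documented output_word_offsets / output_char_offsets options should also work outside the pipeline. A minimal, self-contained sketch, assuming en_sample is a 16 kHz mono waveform as in the snippet above:

import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

processor = AutoProcessor.from_pretrained("facebook/mms-1b-all")
model = Wav2Vec2ForCTC.from_pretrained("facebook/mms-1b-all")

# en_sample: 16 kHz mono waveform (e.g. a 1-D float array)
inputs = processor(en_sample, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
ids = torch.argmax(logits, dim=-1)[0]

# ask the tokenizer for word offsets along with the text
decoded = processor.decode(ids, output_word_offsets=True)

# offsets are counted in logit frames; convert frames -> seconds
time_per_frame = model.config.inputs_to_logits_ratio / processor.feature_extractor.sampling_rate

word_timestamps = [
    {
        "word": w["word"],
        "start": round(w["start_offset"] * time_per_frame, 2),
        "end": round(w["end_offset"] * time_per_frame, 2),
    }
    for w in decoded.word_offsets
]
print(decoded.text)
print(word_timestamps)

Passing output_char_offsets=True instead should give per-character offsets, mirroring return_timestamps="char" in the pipeline.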