May I ask how I could use the pipeline to output the transcript as SRT with timestamps?

by ziweithunder - opened 9 days ago

9 days ago

Thanks for fine-tuning the Cantonese model. May I ask how I could use the pipeline to output the transcript in SRT format with timestamps? I can run an audio file and get a paragraph of Cantonese text, but I couldn't find a way to produce SRT-style output using Whisper.
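For reference, here is a minimal sketch of one way this could work with the Transformers pipeline. It assumes the model loads through the standard `automatic-speech-recognition` pipeline and that calling it with `return_timestamps=True` returns a `chunks` list of `{"timestamp": (start, end), "text": ...}` entries. The helper names (`format_timestamp`, `chunks_to_srt`, `transcribe_to_srt`), the 30-second chunk length, and the 2-second fallback for a missing end time are illustrative choices, not something taken from the model card.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(chunks) -> str:
    """Convert pipeline chunks into SRT text, one numbered cue per chunk."""
    entries = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # Assumption: the last chunk can come back without an end time,
            # so give it a 2-second placeholder duration.
            end = start + 2.0
        entries.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(entries)


def transcribe_to_srt(audio_path: str, srt_path: str) -> None:
    """Transcribe an audio file and write the result as an .srt file."""
    # Imported here so the formatting helpers work without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model="alvanlii/whisper-small-cantonese",
        chunk_length_s=30,  # assumed chunk length for long audio
    )
    result = asr(audio_path, return_timestamps=True)
    with open(srt_path, "w", encoding="utf-8") as f:
        f.write(chunks_to_srt(result["chunks"]))
```

Calling `transcribe_to_srt("audio.wav", "audio.srt")` (both file names are placeholders) would then write the subtitle file next to the audio. The chunk boundaries are whatever Whisper emits, so cues may be longer than typical subtitle lines.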

alvanlii

Owner 9 days ago

I usually use WhisperX to get word- and sentence-level timestamps. I am not sure exactly how to export the result to SRT files, but I have now uploaded the CTS version of this model, so you don't need to convert it.
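On the export step, a rough sketch of turning WhisperX-style output into SRT might look like the following. It assumes the transcription result is a dict whose `segments` list holds `start`, `end`, and `text` fields, which is the shape `model.transcribe` returns in WhisperX. The `model_name` argument is a placeholder for whichever converted model WhisperX can load; whether a given repository loads this way is an assumption here, and `srt_time`, `segments_to_srt`, and `whisperx_to_srt` are hypothetical helper names.

```python
def srt_time(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm for an SRT cue."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments with 'start', 'end', and 'text' keys."""
    lines = []
    for i, seg in enumerate(segments, start=1):
        lines += [
            str(i),
            f"{srt_time(seg['start'])} --> {srt_time(seg['end'])}",
            seg["text"].strip(),
            "",  # blank line terminates the cue
        ]
    return "\n".join(lines)


def whisperx_to_srt(audio_path: str, srt_path: str, model_name: str,
                    device: str = "cpu") -> None:
    """Transcribe with WhisperX and write sentence-level cues to an .srt file."""
    # Imported lazily so the helpers above can be used without WhisperX installed.
    import whisperx

    # model_name is a placeholder: it must point at a model WhisperX can load.
    model = whisperx.load_model(model_name, device, compute_type="int8")
    audio = whisperx.load_audio(audio_path)
    result = model.transcribe(audio, batch_size=8)
    with open(srt_path, "w", encoding="utf-8") as f:
        f.write(segments_to_srt(result["segments"]))
```

This only uses the sentence-level segments; word-level timestamps would need WhisperX's separate alignment step, which is not shown.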

ziweithunder

9 days ago

edited 9 days ago

Thanks for your reply. I am a beginner with Whisper and your model. May I ask what "CTS version" means and stands for?
Do you mean I can use whisperx.load_model("alvanlii/whisper-small-cantonese") to load this model directly instead of using the pipeline method?
Thanks.

alvanlii

Owner 8 days ago

edited 8 days ago

hmmm it might be easier for you to use this one
