How can I use the pipeline and output the transcript as SRT with timestamps?

#9
by ziweithunder - opened

Thanks for fine-tuning the Cantonese model. How can I use the pipeline to output the transcript in SRT format with timestamps? I can run the model on an audio file and get a paragraph of Cantonese text, but I couldn't find a way to produce SRT-style output with Whisper.

I usually use WhisperX to get word- and sentence-level timestamps. I'm not sure exactly how to export the result to SRT files, but I have now uploaded the CTS version of this model, so you don't need to convert it.
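For the SRT export itself, here is a minimal sketch of one way it could be done, not necessarily what the model author uses. It assumes the transcript comes as a list of chunks shaped like the `chunks` field that the transformers ASR pipeline returns with `return_timestamps=True`, i.e. dicts with a `timestamp` pair in seconds and a `text` string. The helper names `format_timestamp` and `chunks_to_srt` and the file name `audio.wav` are made up for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks) -> str:
    """Turn [{'timestamp': (start, end), 'text': ...}, ...] into SRT text."""
    blocks = []
    index = 1
    for chunk in chunks:
        text = chunk["text"].strip()
        if not text:
            continue  # skip empty segments so cue numbers stay consecutive
        start, end = chunk["timestamp"]
        if end is None:
            end = start  # assumption: a missing end time falls back to the start
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
        index += 1
    # SRT cues are separated by a blank line
    return "\n".join(blocks)


# Hypothetical usage with the transformers pipeline (not run here):
# from transformers import pipeline
# asr = pipeline("automatic-speech-recognition", model="alvanlii/whisper-small-cantonese")
# result = asr("audio.wav", return_timestamps=True)
# with open("audio.srt", "w", encoding="utf-8") as f:
#     f.write(chunks_to_srt(result["chunks"]))
```

The same function should also work on WhisperX segments if you first map each segment's `start`, `end`, and `text` into this chunk shape.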

Thanks for your reply. I am a beginner with Whisper and your model. May I ask what "CTS version" stands for?
Do you mean I can call whisperx.load_model("alvanlii/whisper-small-cantonese") to load this model directly instead of using the pipeline method?
Thanks.

Hmm, it might be easier for you to use this one.
