Is this the best punctuator at the moment or are there better ones?
I am using bert-restore-punctuation to fix the transcriptions generated by Whisper for my YouTube channel videos.
My YouTube channel (technology, education and programming): https://www.youtube.com/SECourses
Whisper sometimes loses the ability to punctuate, I don't know why, and then the punctuation has to be fixed, otherwise the result is very bad as a subtitle.
So if there is any alternative punctuator at the moment that works better than felflare/bert-restore-punctuation, could anyone let me know?
And this is my how-to-use-Whisper video, if anyone is interested: https://youtu.be/msj3wuYf3d8
Have you found an alternative?
I am still using the same one.
@MonsterMMORPG
your videos, e.g. https://www.youtube.com/watch?v=dpM02YMj8FY, now seem to contain punctuation.
Was it done by YouTube automatically or did you rely on tools like ChatGPT or these online HuggingFace punctuators?
Thanks!
I follow 4 steps:
first, transcribe with Whisper
then punctuate with felflare/bert-restore-punctuation
then fix capital letters
then manually fix Whisper's transcription errors
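For illustration, here is a minimal sketch of what the "fix capital letters" step could look like in Python. It is not the code actually used in this workflow: the function name `fix_capitals` and the regex rules are assumptions, and it only handles sentence starts and the pronoun "I" (so it would also turn "i.e." into "I.e.").

```python
import re


def fix_capitals(text: str) -> str:
    """Capitalize sentence starts and the standalone pronoun "i".

    Hypothetical helper: a simple regex heuristic, not a full truecaser.
    """
    # Uppercase a lowercase letter at the start of the text, or after
    # ".", "!" or "?" followed by whitespace.
    text = re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )
    # Uppercase the standalone pronoun "i" (also covers "i'm", "i've").
    return re.sub(r"\bi\b", "I", text)


if __name__ == "__main__":
    print(fix_capitals("hello world. this is a test. i think i'm done"))
```

Proper nouns (Whisper, YouTube, names) are not covered by this heuristic and would still need a word list or a manual pass.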
That's a lot of work.
I'm thinking about converting a good punctuator to ONNX and making a simple web page that lets you enter text or a YouTube link and automatically runs the network over the existing raw transcript.
Would that be useful for you?
Also, I found felflare/bert-restore-punctuation not to be as good as https://huggingface.co/unikei/distilbert-base-re-punctuate
Plus, I wasn't able to convert felflare/bert-restore-punctuation to ONNX, but unikei/distilbert-base-re-punctuate works in ONNX.
Have you tried unikei/distilbert-base-re-punctuate?
Actually, only the last part is manual.
I haven't tested this one yet: unikei/distilbert-base-re-punctuate
nice ty
@MonsterMMORPG
I created a simple web page where you can try punctuating any YouTube video.
Let me know if you have time to try and what the results look like for you.
https://www.appblit.com/scribe
Laurent
It works decently.
I tested it on this video: https://youtu.be/PNA9p94JmtY
Thanks for testing it!
Now it also streams the results as they come in (chunk by chunk, because DistilBERT accepts inputs of at most 512 tokens), and works on text of any length.
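For anyone curious, the chunk-by-chunk idea can be sketched roughly like this in Python. This is only an illustration, not the code behind the web page: `chunk_words`, `punctuate_long_text` and the `punctuate` callable are hypothetical names, and the 200-word limit is an assumed conservative stand-in for the 512-token limit, since one word can map to several subword tokens.

```python
from typing import Callable, Iterator


def chunk_words(text: str, max_words: int = 200) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def punctuate_long_text(
    text: str,
    punctuate: Callable[[str], str],
    max_words: int = 200,
) -> Iterator[str]:
    """Yield punctuated chunks one at a time, so results can be streamed.

    `punctuate` is a placeholder for whatever model call restores
    punctuation on a single short chunk.
    """
    for chunk in chunk_words(text, max_words):
        yield punctuate(chunk)


if __name__ == "__main__":
    # str.capitalize stands in for a real punctuation model here.
    for piece in punctuate_long_text("one two three four five", str.capitalize, max_words=2):
        print(piece)
```

Splitting at fixed word counts can cut a sentence in half at a chunk boundary, so a real implementation might overlap chunks or count tokens with the model's own tokenizer.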
I would like to add paragraph breaks: do you know of a model that performs this task?
Also, would automatic summaries and chapters be useful to you? I saw several videos that have chapters with links to facilitate navigation.
Or is that already provided by YouTube?
I'm @ldenoue on Twitter by the way
Laurent
I add chapters manually.
I am also dividing it into paragraphs myself with my app, but if YouTube can auto-sync it, I much prefer that.
I followed you on Twitter, this is mine: https://twitter.com/GozukaraFurkan
I made a comparison between the punctuator I was using before and the new unikei/distilbert-base-re-punctuate.
Here are the results:
https://twitter.com/GozukaraFurkan/status/1715045585324003358
Very good analysis. How did you handle the long texts?
I load it from a text file in a Python app, if you mean that.
I used the example code they put in the README. They are amazing :)