Speaker identification
#4, opened by Dehma
Is it possible to get which speaker is talking at each timestamp? For example:
00:00 -> 00:10: [Person1] Hello my name is …
00:10 -> 00:15: [Person2] Nice to meet you …
00:15 -> 00:20: [Person1] What is your …
Or at least just mentioning when the speaker is changing?
I don't think diarization is supported here yet.
These links may be helpful:
- Transcription and diarization (speaker identification): https://github.com/openai/whisper/discussions/264
- Whisper's transcription plus Pyannote's Diarization: https://github.com/Majdoddin/nlp
- Neural speaker diarization with pyannote.audio: https://github.com/pyannote/pyannote-audio
- Speaker diarization (partitioning audio based on speaker identity): https://github.com/openai/whisper/discussions/104
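For anyone who wants the general idea before digging into those links, here is a minimal sketch of the merging step: each transcript segment gets the speaker whose diarization turn overlaps it the most. This is not taken from any of the linked projects. The `segments` list mimics the `start`/`end`/`text` fields Whisper returns per segment, the `turns` tuples are an assumed, simplified form of a diarization result, and all the data and helper names (`assign_speakers`, `render`, `fmt`) are made up for illustration.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each transcript segment with the speaker whose turn overlaps it most.

    segments: list of dicts with "start", "end", "text" (times in seconds)
    turns:    list of (start, end, speaker) tuples from a diarization step
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for t_start, t_end, speaker in turns:
            ov = overlap(seg["start"], seg["end"], t_start, t_end)
            if ov > best_overlap:
                best_speaker, best_overlap = speaker, ov
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


def fmt(seconds):
    """Format a time in seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def render(labeled):
    """Turn labeled segments into 'MM:SS -> MM:SS: [Speaker] text' lines."""
    return [
        f'{fmt(s["start"])} -> {fmt(s["end"])}: [{s["speaker"]}] {s["text"]}'
        for s in labeled
    ]


# Hypothetical transcript segments (shaped like Whisper's per-segment output).
segments = [
    {"start": 0.0, "end": 10.0, "text": "Hello my name is ..."},
    {"start": 10.0, "end": 15.0, "text": "Nice to meet you ..."},
    {"start": 15.0, "end": 20.0, "text": "What is your ..."},
]

# Hypothetical diarization turns; boundaries rarely line up exactly with segments.
turns = [
    (0.0, 9.6, "Person1"),
    (9.6, 15.2, "Person2"),
    (15.2, 20.0, "Person1"),
]

lines = render(assign_speakers(segments, turns))
for line in lines:
    print(line)
```

Picking the speaker by maximum overlap is only one simple choice. It mislabels segments in which two people talk, so splitting on word-level timestamps may work better for fast exchanges.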
Working Google Colab project I put together: https://colab.research.google.com/drive/11ccdRYZSHBbUYI9grn6S1O67FVyvCKyL
Please note that you must first convert the audio to mono!
Instructions for how to do that are in the description.
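If you would rather do the mono conversion yourself, below is a rough sketch using only the Python standard library. It is not the method from the notebook description. It assumes an uncompressed 16-bit PCM WAV file and simply averages the channels; the function names and the file names in the usage comment are hypothetical.

```python
import struct
import wave


def downmix_to_mono(frames: bytes, n_channels: int) -> bytes:
    """Average interleaved 16-bit little-endian PCM channels into one channel."""
    samples = struct.unpack(f"<{len(frames) // 2}h", frames)
    mono = [
        # Integer average of one frame (one sample per channel).
        sum(samples[i:i + n_channels]) // n_channels
        for i in range(0, len(samples), n_channels)
    ]
    return struct.pack(f"<{len(mono)}h", *mono)


def wav_to_mono(src_path: str, dst_path: str) -> None:
    """Read a 16-bit PCM WAV file and write a mono copy at the same sample rate."""
    with wave.open(src_path, "rb") as src:
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        n_channels = src.getnchannels()
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(downmix_to_mono(frames, n_channels))


# Usage (hypothetical file names):
# wav_to_mono("interview_stereo.wav", "interview_mono.wav")
```

For other formats such as MP3 or 24-bit audio, a dedicated audio tool is the safer route; this sketch is only meant to show what "convert to mono" involves.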