Spaces: openai/whisper

Speaker identification

by Dehma - opened Sep 23, 2022

Discussion

Dehma

Sep 23, 2022

Is it possible to get which speaker is talking for each timestamp range? For example:
00:00 -> 00:10: [Person1] Hello my name is …
00:10 -> 00:15: [Person2] Nice to meet you …
00:15 -> 00:20: [Person1] What is your …

Or at least an indication of when the speaker changes?

Lagstill

Sep 23, 2022

I don't think diarization (speaker identification) is supported yet.

devalias

Nov 9, 2022

These links may be helpful:

- Transcription and diarization (speaker identification): https://github.com/openai/whisper/discussions/264
  - Whisper's transcription plus Pyannote's diarization: https://github.com/Majdoddin/nlp
    - Neural speaker diarization with pyannote.audio: https://github.com/pyannote/pyannote-audio
- Speaker diarization (partitioning audio based on speaker identity): https://github.com/openai/whisper/discussions/104
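
For anyone wondering how the two outputs fit together, here is a minimal sketch of the merge step only. It does not come from any of the linked projects. It assumes you already have transcript segments as dicts with `start`, `end`, and `text` keys (the shape of Whisper's `result["segments"]`) and speaker turns as `(start, end, speaker)` tuples that you would build from a diarizer's output. The helper names `assign_speakers` and `render`, and the sample data at the bottom, are made up for illustration. Each segment gets the speaker whose turn overlaps it the most.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each transcript segment with the speaker of the best-overlapping turn.

    segments: list of dicts with "start", "end", "text" (assumed input shape).
    turns: list of (start, end, speaker) tuples (assumed input shape).
    Segments that overlap no turn get speaker None.
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = None, 0.0
        for t_start, t_end, speaker in turns:
            ov = overlap(seg["start"], seg["end"], t_start, t_end)
            if ov > best_overlap:
                best_speaker, best_overlap = speaker, ov
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


def fmt(seconds):
    """Format seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def render(labeled):
    """Produce lines in the 'start -> end: [speaker] text' layout."""
    return [
        f"{fmt(s['start'])} -> {fmt(s['end'])}: "
        f"[{s['speaker'] or 'Unknown'}] {s['text'].strip()}"
        for s in labeled
    ]


if __name__ == "__main__":
    # Made-up illustrative data, not real model output.
    segments = [
        {"start": 0.0, "end": 10.0, "text": " Hello"},
        {"start": 10.0, "end": 15.0, "text": " Nice to meet you"},
    ]
    turns = [(0.0, 9.5, "SPEAKER_00"), (9.5, 15.2, "SPEAKER_01")]
    for line in render(assign_speakers(segments, turns)):
        print(line)
```

Picking the maximum overlap is the simplest choice. It will mislabel a segment in which two people speak, so splitting on word-level timestamps may work better if you have them.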

Z4Y

Jun 15, 2023

Here is a working Google Colab project I put together: https://colab.research.google.com/drive/11ccdRYZSHBbUYI9grn6S1O67FVyvCKyL
Please note that you must first convert the audio to mono!
Instructions are in the description.
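
If you would rather do the mono conversion in code, the following is one possible sketch using only the Python standard library. It is not the method from the Colab description. It assumes an uncompressed 16-bit PCM WAV file, and the function name `downmix_to_mono` is made up for illustration. It averages the channels of every frame into a single channel.

```python
import array
import sys
import wave


def downmix_to_mono(src_path, dst_path):
    """Average all channels of a 16-bit PCM WAV file into one mono channel."""
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        raw = src.readframes(src.getnframes())

    # Assumption of this sketch: 2 bytes per sample (16-bit PCM).
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM WAV files")

    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved per frame: ch0, ch1, ..., ch0, ch1, ...
    mono = array.array(
        "h",
        (
            sum(samples[i:i + n_channels]) // n_channels
            for i in range(0, len(samples), n_channels)
        ),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
```

For other formats (MP3, 24-bit audio, and so on) a dedicated audio tool is the safer route.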
