pyannote.audio

non-profit

https://github.com/pyannote/pyannote-audio

AI & ML interests

speaker diarization // speaker recognition // speaker segmentation // voice activity detection // overlapped speech detection // speaker change detection

pyannote.audio is an open-source toolkit for speaker diarization.

Pretrained pipelines reach state-of-the-art performance on most academic benchmarks.
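As a rough illustration of what using a pretrained pipeline looks like, here is a minimal sketch. It assumes pyannote.audio 3.x is installed, that the `pyannote/speaker-diarization-3.1` pipeline listed on this page is the one being loaded, and that a Hugging Face access token is available (the pipeline may also require accepting its user conditions first). The helper names `diarize` and `format_turns` are hypothetical and exist only for this example.

```python
from typing import List, Tuple

# (start in seconds, end in seconds, speaker label)
Turn = Tuple[float, float, str]


def diarize(audio_path: str, hf_token: str) -> List[Turn]:
    """Run a pretrained diarization pipeline on one audio file."""
    # Imported lazily so format_turns below works without pyannote.audio.
    from pyannote.audio import Pipeline

    # Assumption: pyannote.audio 3.x, where the token argument is named
    # `use_auth_token`; other releases may name it differently.
    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1",
        use_auth_token=hf_token,
    )
    diarization = pipeline(audio_path)

    # Flatten the result into plain tuples, one per speaker turn.
    return [
        (turn.start, turn.end, speaker)
        for turn, _, speaker in diarization.itertracks(yield_label=True)
    ]


def format_turns(turns: List[Turn]) -> List[str]:
    """Render speaker turns as human-readable lines."""
    return [f"{start:.1f}s - {end:.1f}s: {speaker}" for start, end, speaker in turns]
```

Calling `format_turns(diarize("audio.wav", "<your token>"))` would then give one line per speaker turn; the file name and token here are placeholders.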

Using it in production?
Consider switching to pyannoteAI for better and faster options.

Benchmark	v2.1	v3.1	pyannoteAI
AISHELL-4	14.1	12.2	11.2
AliMeeting (channel 1)	27.4	24.4	19.3
AMI (IHM)	18.9	18.8	15.8
AMI (SDM)	27.1	22.4	19.3
AVA-AVD	66.3	50.0	44.8
CALLHOME (part 2)	31.6	28.4	19.8
DIHARD 3 (full)	26.9	21.7	16.8
Earnings21	17.0	9.4	9.1
Ego4D (dev.)	61.5	51.2	44.0
MSDWild	32.8	25.3	19.8
RAMC	22.5	22.2	11.1
REPERE (phase2)	8.2	7.8	7.6
VoxConverse (v0.3)	11.2	11.3	9.8
Diarization error rate (in %, lower is better)
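For readers new to the metric, diarization error rate is conventionally the sum of false alarm, missed detection, and speaker confusion durations, divided by the total duration of reference speech. The sketch below shows that arithmetic and a relative comparison between two table entries. The durations in the first example are made up for illustration, and the exact scoring setup behind the table (collar, overlap handling) is not stated here, so treat this as the general definition rather than the benchmark protocol.

```python
def diarization_error_rate(
    false_alarm: float,
    missed_detection: float,
    confusion: float,
    total_speech: float,
) -> float:
    """Return DER in percent; all arguments are durations in seconds."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return 100.0 * (false_alarm + missed_detection + confusion) / total_speech


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` lowers the error rate of `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical durations: 12 s false alarm, 30 s missed, 18 s confusion,
# over 600 s of reference speech.
print(f"DER: {diarization_error_rate(12, 30, 18, 600):.1f}%")

# Values taken from the table: RAMC, v3.1 (22.2) vs pyannoteAI (11.1).
print(f"RAMC relative reduction: {relative_reduction(22.2, 11.1):.0%}")
```

Note that DER can exceed 100% when false alarms are large, since the denominator only counts reference speech.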

On high-end NVIDIA hardware:

v2.1 takes around 1m30s to process 1h of audio
v3.1 takes around 1m20s to process 1h of audio
On-premise pyannoteAI takes less than 30s to process 1h of audio
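These timings can be restated as a speed-up over real time by dividing the audio duration by the processing time. A small sketch of that arithmetic, using the figures above (the pyannoteAI entry uses the 30s upper bound, so its speed-up is a lower bound):

```python
def speedup(audio_seconds: float, processing_seconds: float) -> float:
    """How many times faster than real time the processing runs."""
    return audio_seconds / processing_seconds


HOUR = 3600.0

# Processing time in seconds for 1h of audio, from the figures above.
timings = {
    "v2.1": 90.0,        # around 1m30s
    "v3.1": 80.0,        # around 1m20s
    "pyannoteAI": 30.0,  # "less than 30s", so this is an upper bound
}

for name, seconds in timings.items():
    print(f"{name}: about {speedup(HOUR, seconds):.0f}x real time")
```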

spaces 1

Pretrained pipelines

models 15

pyannote/speech-separation-ami-1.0

Updated 10 days ago • 65.7k • 40

pyannote/separation-ami-1.0

Updated Jul 16 • 7

pyannote/speaker-diarization-3.1

Automatic Speech Recognition • Updated May 10 • 9.97M • 554

pyannote/overlapped-speech-detection

Automatic Speech Recognition • Updated May 10 • 33.4k • 29

pyannote/speaker-segmentation

Automatic Speech Recognition • Updated May 10 • 130 • 29

pyannote/voice-activity-detection

Automatic Speech Recognition • Updated May 10 • 852k • 163

pyannote/segmentation

Voice Activity Detection • Updated May 10 • 6.96M • 495

pyannote/speaker-diarization

Automatic Speech Recognition • Updated May 10 • 5.85M • 864

pyannote/speaker-diarization-3.0

Automatic Speech Recognition • Updated May 10 • 2.63M • 171

pyannote/embedding

Updated May 10 • 1.13M • 118
