Diarization Python Packages

spy-der

Simple Python package for fast DER computation

17K 35 7

diarize

Speaker diarization for Python — "who spoke when?" CPU-only, no API keys, Apache 2.0. ~10.8% DER on VoxConverse, 8x faster than real-time.

3K 91 8

polyvoice

Speaker diarization for Rust — who spoke when. ONNX-powered: Silero VAD, WeSpeaker embeddings, Pyannote segmentation, K-means/AHC clustering, overlap detection. Python bindings & CLI included.

3K 3 1

whisper-smith

CLI and Python library for transcribing audio with OpenAI Whisper — supports txt, json, srt, vtt output and optional speaker diarization.

2K 0 0

senko

Very fast speaker diarization

1K 263 29

yapsnap

Snap any video URL or local audio/video into a plaintext transcript. CPU-first, offline, single command.

1K 282 11

dinnote

Audio denoising, VAD, speaker diarization, and transcription pipeline using Demucs, Silero VAD, pyannote, and Whisper

1K 1 0

murmurai-core

🎙️ Drop-in replacement for paid transcription APIs. Self-hosted, GPU-powered, speaker diarization. Free forever: uvx murmurai

737 42 17

resilient-stt

The only Speech-To-Text pipeline you need

565 1 1

whisper-speaker-id

Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.

549 26 1

whisperx-api

🎙️ Drop-in replacement for paid transcription APIs. Self-hosted, GPU-powered, speaker diarization. Free forever: uvx murmurai

535 42 17

webinar-transcriber

Local-first Python CLI that turns webinar audio and slide videos into transcripts, reports, scenes, diagnostics, and optional LLM-polished notes.

519 1 0

murmurai

🎙️ Drop-in replacement for paid transcription APIs. Self-hosted, GPU-powered, speaker diarization. Free forever: uvx murmurai

510 42 17

pvfalcon

On-device speaker diarization powered by deep learning

498 74 7

dlzoom

Download Zoom cloud recordings from the command line - M4A audio + STJ diarization JSON, built for custom transcription pipelines with Whisper and other ASR tools

439 0 0

dover-lap

Python package for combining diarization system outputs.

433 94 12

simpleder

A lightweight library to compute Diarization Error Rate (DER).

350 62 9

voicetag

Speaker identification powered by pyannote and resemblyzer

334 51 5

pvfalcondemo

On-device speaker diarization powered by deep learning

301 74 7

p3x-meet-assistant

Real-time AI speech-to-text for meetings with GPT-4o Transcribe and GPU speaker diarization

281 0 0

wavlmmsdd

This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from Nvidia.

177 12 3

pafts

PAFTS: a library that preprocesses audio for TTS.

163 27 5

rtrimmer

Lightweight Python package to trim RTTM diarization files and audio files

98 1 0

opensono

Open-source audio transcription with speaker diarization

50 0 0
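
Several packages above (spy-der, simpleder) compute the Diarization Error Rate (DER). As a rough illustration of what that metric measures, here is a minimal sketch. It does not reproduce the API of any listed package. The function names `speaker_at` and `simple_der` are hypothetical, and the sketch assumes that segments do not overlap, that hypothesis speaker labels have already been mapped to reference labels, and that no forgiveness collar is applied.

```python
from typing import List, Optional, Tuple

# A segment is (speaker_label, start_seconds, end_seconds).
Segment = Tuple[str, float, float]


def speaker_at(segments: List[Segment], t: float) -> Optional[str]:
    """Return the speaker active at time t, or None during silence."""
    for speaker, start, end in segments:
        if start <= t < end:
            return speaker
    return None


def simple_der(ref: List[Segment], hyp: List[Segment]) -> float:
    """Illustrative DER: (missed + false alarm + confusion) / reference speech.

    Assumes non-overlapping segments and pre-mapped speaker labels.
    """
    # Every segment boundary splits the timeline into intervals in which
    # neither the reference nor the hypothesis changes speaker.
    bounds = sorted({t for _, s, e in list(ref) + list(hyp) for t in (s, e)})
    missed = false_alarm = confusion = total = 0.0
    for left, right in zip(bounds, bounds[1:]):
        duration = right - left
        midpoint = (left + right) / 2
        r = speaker_at(ref, midpoint)
        h = speaker_at(hyp, midpoint)
        if r is not None:
            total += duration
            if h is None:
                missed += duration        # reference speech with no hypothesis
            elif h != r:
                confusion += duration     # speech attributed to the wrong speaker
        elif h is not None:
            false_alarm += duration       # hypothesis speech during reference silence
    if total == 0:
        raise ValueError("reference contains no speech")
    return (missed + false_alarm + confusion) / total


if __name__ == "__main__":
    reference = [("A", 0.0, 10.0), ("B", 10.0, 20.0)]
    hypothesis = [("A", 0.0, 8.0), ("B", 8.0, 20.0)]
    print(f"DER = {simple_der(reference, hypothesis):.3f}")
```

Full implementations typically also search for the best mapping between hypothesis and reference speaker labels and handle overlapping speech, which this sketch deliberately leaves out.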