PyRank

Speech Processing Python Packages

Python packages with the GitHub topic speech-processing. Sorted by relevance, with stars and monthly downloads.
snakers4
silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

829K 9K 771
linto-ai
whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

100K 3K 211
microsoft
torchscale

Foundation Architecture for (M)LLMs

73K 3K 225
sutariyaraj
indic-num2words

Python library for converting numbers to words for all Indian Languages.

24K 36 14
SuperKogito
spafe

spafe: Simplified Python Audio Features Extraction

19K 483 77
r9y9
pysptk

A Python wrapper for Speech Signal Processing Toolkit (SPTK).

15K 451 80
lars76
swift-f0

Fast and accurate fundamental frequency (F0) detector using convolutional neural networks

9K 159 21
resemble-ai
resemble-enhance

AI powered speech denoising and enhancement

6K 2K 278
haoheliu
voicefixer

General Speech Restoration

4K 1K 158
haoxiangsnr
audioinfo

A small tool to calculate the distribution of audio durations in a directory

4K 14 1
r9y9
nnmnkwii

Library to build speech synthesis systems designed for easy and fast prototyping.

4K 399 71
daanzu
silero-vad-lite

Lightweight wrapper for Silero VAD using internal ONNX Runtime and with no python package dependencies

3K 18 1
FoxNoseTech
diarize

Speaker diarization for Python — "who spoke when?" CPU-only, no API keys, Apache 2.0. ~10.8% DER on VoxConverse, 8x faster than real-time.

3K 67 7
vocalpy
vak

A neural network framework for researchers studying acoustic communication

2K 91 17
tabahi
bournemouth-forced-aligner

Extract phoneme-level timestamps from speech audio.

2K 132 14
magcil
deepaudio-x

A Python library to train Deep Neural Networks on various audio tasks using Self-Supervised backbones.

2K 32 1
EveryVoiceTTS
everyvoice

The EveryVoice TTS Toolkit - Text To Speech for your language

2K 44 4
Ilyushin
signal-transformation

Widely used signal transformation using TensorFlow API.

1K 1 0
Shahabks
myspokenlanguagedetection

Spoken language identification with CNN and RNN - Improved Version: accuracy up

891 3 3
MontrealCorpusTools
polyglotdb

No description available

878 51 17
alessandroragano
scoreq

SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)

863 110 8
takenori-y
lfeats

A unified interface to extract hidden representations from various speech foundation models

701 1 0
wq2012
simpleder

A lightweight library to compute Diarization Error Rate (DER).

698 62 9
tann9949
vistec-ser

Speech Emotion Recognition models and training using PyTorch

681 3 2
    • Data from PyPI, GitHub, ClickHouse, and BigQuery