PyRank

Speech To Text Python Packages

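Python packages tagged with the GitHub topic speech-to-text, sorted by relevance. Each entry gives the GitHub owner, the package name, a short description, and a row of counts that begins with monthly downloads and GitHub stars.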
Uberi
speechrecognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

7.9M 9K 2K
SYSTRAN
faster-whisper

Faster Whisper transcription with CTranslate2

7.1M 23K 2K
deepgram
deepgram-sdk

Official Python SDK for Deepgram.

2.7M 430 130
m-bain
whisperx

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

1M 22K 2K
alphacep
vosk

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

571K 15K 2K
KoljaB
realtimestt

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

369K 10K 835
microsoft
foundry-local-sdk

Foundry Local Manager Python SDK: Control-plane SDK for Foundry Local.

364K 2K 309
k2-fsa
sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

258K 12K 1K
revdotcom
rev-ai

Rev AI Python SDK

254K 36 13
Blaizzy
mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

186K 7K 594
snakers4
silero

Silero Models: pre-trained text-to-speech models made embarrassingly simple

175K 6K 366
k2-fsa
sherpa-onnx-core

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

163K 12K 1K
linto-ai
whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

96K 3K 211
gradio-app
fastrtc

The Python library for real-time communication

55K 5K 430
k2-fsa
sherpa-onnx-bin

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

52K 12K 1K
speechmatics
speechmatics-python

Python library and CLI for Speechmatics

52K 75 23
Softcatala
whisper-ctranslate2

Whisper command line client compatible with original OpenAI client based on CTranslate2.

40K 1K 126
istupakov
onnx-asr

A lightweight Python package for Automatic Speech Recognition using ONNX models

27K 316 30
Xewdy444
playwright-recaptcha

A Python library for solving reCAPTCHA v2 and v3 with Playwright

20K 539 67
mozilla
deepspeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

20K 27K 4K
Capsize-Games
airunner

Run local open-source AI models (Stable Diffusion, LLMs, TTS, STT, chatbots) in a lightweight Python GUI

13K 1K 97
analyticsinmotion
werpy

🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.

12K 26 6
mozilla
deepspeech-gpu

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

10K 27K 4K
ARahim3
mlx-tune

Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT, Embedding, and OCR fine-tuning — natively on MLX. Unsloth-compatible API.

9K 1K 80
Data from PyPI, GitHub, ClickHouse, and BigQuery