Voice Cloning Python Packages

tts

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

137K 46K 6K

fish-audio-sdk

The official Python library for the Fish Audio API.

129K 182 36

voxcpm

VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

100K 32K 4K

pyht

PlayHT Python SDK - AI Text-to-Speech Streaming & Voice Cloning API

64K 218 31

abstractvoice

Modular Python voice I/O for AI applications. Text-to-speech, speech-to-text, and voice cloning with local-first defaults and remote provider support. One interface across Piper, Supertonic, OpenAI, OmniVoice, and more.

5K 6 0

paddlespeech

Easy-to-use speech toolkit including self-supervised learning models, SOTA/streaming ASR with punctuation, streaming TTS with a text frontend, a speaker verification system, end-to-end speech translation, and keyword spotting. Won the NAACL 2022 Best Demo Award.

4K 13K 2K

mlx-speech

Pure-MLX speech synthesis, voice cloning, dialogue, sound-effects, and ASR for Apple Silicon: Fish S2 Pro, VibeVoice, LongCat, MOSS, Step-Audio, Cohere ASR.

3K 29 5

styletts2

🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning

3K 159 40

gsv-tts-lite

GSV-TTS-Lite: a high-performance inference engine designed specifically for the GPT-SoVITS text-to-speech model (few-shot voice cloning).

2K 115 13

paddlespeech-feat

2K 13K 2K

paddleaudio

2K 13K 2K

speechcraft

🔊 Text2Speech, voice cloning, and Voice2Voice conversion with the text-prompted generative audio model Bark

2K 71 9

str2speech

An easy-to-use library and command-line tool for TTS

2K 15 1

paddlespeech-ctcdecoders

1K 13K 2K

tts-webui-styletts2

🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning

1K 159 40

genie-tts

GPT-SoVITS ONNX Inference Engine & Model Converter

1K 2K 113

vocaboot

Terminal-first singing voice synthesis framework for the Claude Code era.

833 0 0

ebook2audiobook

Convert eBooks to audiobooks with chapters and metadata

689 19K 2K

voxtream

Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking-rate Control

578 241 30

mel-cepstral-distance

A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral Distortion, MCD) between two inputs. This implementation is based on the method proposed by Robert F. Kubichek in "Mel-Cepstral Distance Measure for Objective Speech Quality Assessment".

512 67 12

zerovox

Zero-shot real-time TTS system, fully offline, free and open source

273 56 9

chatterbox-tts-api

REST API for ChatterboxTTS with OpenAI compatibility

189 620 146

strands-omnivoice

OmniVoice multilingual zero-shot TTS toolkit for Strands Agents — voice cloning, voice design, and 600+ language synthesis as agent tools

168 2 0

chatterbox-api

REST API for ChatterboxTTS with OpenAI compatibility

163 620 146