PyRank

Speculative Decoding Python Packages

Python packages with the GitHub topic speculative-decoding. Sorted by relevance, with stars and monthly downloads.
youssofal
mtplx

2.24x decode TPS increase on Qwen 3.6 27B at temperature 0.6 | Native MTP speculative decoding on Apple Silicon with no external drafter.

16K 492 19
Tencent
angelslim

Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

7K 1K 131
aphrodite-engine
aphrodite-engine

Large-scale LLM inference engine

5K 2K 197
intel
intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

5K 2K 217
youngharold
tightwad

Mixed-vendor GPU inference cluster manager with speculative decoding

2K 22 2
dualform-labs
m5-infer

Extraordinary speed, extraordinary quality — MLX-based inference engine for Apple Silicon.

2K 0 1
SafeAILab
eagle-llm

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

414 2K 278
lpoee
opencac

Multi-agent orchestration CLI for AI coding tools. Chain Claude Code, Antigravity, and Codex with validated handoffs, JSONL audit logging, hybrid cloud/local routing, and speculative decoding for local LLMs.

118 2 0
Tencent
angelslim-fork

Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

109 1K 131
llmsresearch
specstream

Fast LLM inference with 2.8x speedup using speculative decoding

65 8 1
Data from PyPI, GitHub, ClickHouse, and BigQuery