PyRank

Turboquant Python Packages

Python packages tagged with the GitHub topic turboquant, sorted by relevance and listed with stars and monthly downloads.
quantumaikr
quantcpp

LLM inference with 7x longer context. Pure C, zero dependencies. Lossless KV cache compression + single-header library.

8K 387 42
RyanCodrai
turbovec

A vector index built on TurboQuant, written in Rust with Python bindings

6K 984 83
Alberto-Codes
turboquant-vllm

TurboQuant KV cache compression plugin for vLLM — asymmetric K/V, 8 models validated, consumer GPUs

4K 48 5
manjunathshiva
turboquant-mlx-full

Extreme weight + KV cache compression for LLMs on Apple Silicon (MLX implementation of Google's TurboQuant)

2K 23 3
back2matching
turboquant

First open-source TurboQuant KV cache compression for LLM inference. Drop-in for HuggingFace. pip install turboquant.

2K 36 7
back2matching
turboquant-vectors

Compress embeddings 6x instantly with TurboQuant. First pip package using Google's TurboQuant (ICLR 2026) for vector search. 71.9% recall vs FAISS PQ 13.3%.

2K 1 1
AlphaWaveSystems
tqai

TurboQuant KV cache compression for local LLM inference

800 1 0
ilyajob05
turboquant-space

SIMD-accelerated 4/8-bit vector quantization for approximate nearest neighbor search, based on TurboQuant (ICLR 2026). Standalone C++17 library with Python bindings

563 6 0
Argonaut790
fused-turboquant

Fused Triton kernels for TurboQuant KV cache compression — 2-4 bit quantization with RHT rotation. Drop-in HuggingFace & vLLM integration. Up to 4.9x KV cache compression for Llama, Qwen, Mistral, and more.

449 8 1
vivekvar-dl
turbokv

First open-source implementation of TurboQuant (arXiv 2504.19874) — 4-7x LLM KV cache compression. pip install turbokv

340 0 0
singhsidhukuldeep
turboquant-hf

Near-optimal weight quantization for LLMs using the TurboQuant algorithm

295 0 0
wjddusrb03
langchain-turboquant

LangChain VectorStore with TurboQuant compression (ICLR 2026) - 6x memory reduction, training-free, no GPU required. The first LangChain integration for Google Research's TurboQuant algorithm.

202 1 2
vivekvar-dl
turboquant-impl

First open-source implementation of TurboQuant (arXiv 2504.19874) — 4-7x LLM KV cache compression. pip install turbokv

133 0 0
wjddusrb03
commitmind

CommitMind: Semantic search for Git commit history powered by TurboQuant vector compression (ICLR 2026). Search commits by meaning, not just keywords.

133 0 0
Data from PyPI, GitHub, ClickHouse, and BigQuery.