Kimi Python Packages

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

5.8M 86K 19K

tokenspeed-mla

TokenSpeed is a speed-of-light LLM inference engine.

1.5M 2K 185

claude-tap

Intercept and inspect Coding Agent API traffic from Claude Code, Codex CLI, Gemini CLI, Cursor CLI, OpenCode, Kimi/Kimi Code, Pi, and Hermes in a local trace viewer.

109K 2K 219

vllm-tpu

A high-throughput and memory-efficient inference and serving engine for LLMs

58K 86K 19K

tokenspeed-smg

TokenSpeed is a speed-of-light LLM inference engine.

23K 2K 185

vllm-cpu-nightly

A high-throughput and memory-efficient inference and serving engine for LLMs

16K 86K 19K

agent6

A coding agent that jails model commands and uses editable state machines for long-running tasks

7K 1 0

illusion-code

AI-powered CLI coding assistant with Claude Code heritage, deep Windows optimization, a bilingual UI, and comprehensive Markdown rendering.

7K 7 3

coding-proxy

A High-Availability, Transparent, and Smart Multi-Vendor Proxy for Claude Code. Supports Claude Plans, GitHub Copilot, Google Antigravity, ZAI/GLM, MiniMax, Qwen, Xiaomi, Kimi, Doubao...

6K 18 2

kitty-bridge

Universal LLM bridge for AI agents. Use Claude Code with MiniMax, Codex with GLM, or Gemini CLI with OpenRouter — one command, any provider. Works with coding agents, OpenClaw, Hermes, and others.

4K 16 4

harnessnovel

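An AI agent for automated long-form web novel writing. It supports integration with Chinese and international models such as ChatGPT, Gemini, and DeepSeek, and uses a book-deconstruction plus imitation-writing approach that runs from new worldbuilding, outline, and volume-outline design through chapter-fragment refinement and chapter-outline design. It addresses common problems in AI writing such as inconsistency, forgetfulness, lack of distinctive style, and a heavy "AI flavor", and supports serialized works on the scale of a million characters.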

2K 40 7

kimi-code-usage

Kimi (Moonshot AI) Coding Plan API usage quota tracker — CLI + MCP Server + VSCode Extension

2K 4 0

tokenspeed-kernel-amd

TokenSpeed AMD-specific high-performance kernels.

1K 2K 185

ai-dynamo-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

606 86K 19K

llm-onesdk

OneSDK is a Python library that provides a unified interface for interacting with various Large Language Model (LLM) providers.

605 2 0

vllm-acc

A high-throughput and memory-efficient inference and serving engine for LLMs

575 86K 19K

vllm-xft

A high-throughput and memory-efficient inference and serving engine for LLMs

506 86K 19K

vllm-musa

vLLM platform plugin for Moore Threads MUSA GPUs

426 86K 19K

vllm-consul

A high-throughput and memory-efficient inference and serving engine for LLMs

407 86K 19K

vllm-npu

A high-throughput and memory-efficient inference and serving engine for LLMs

401 86K 19K

nextai-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

399 86K 19K

devorch

Multi-provider AI coding assistant CLI with 13+ providers

354 4 0

vllm-hust

A high-throughput and memory-efficient inference and serving engine for LLMs

312 86K 19K

kimi4free

A way to use Kimi for free, including login features.

307 4 1