TTFT Python Packages

llm-autotune

Zero-config local LLM optimization for Ollama, LM Studio, and Apple Silicon MLX. Reduces TTFT (time to first token) by 40%, wall time for local agents by 46%, and RAM usage by a factor of 3.

3K 31 3

voice-budget

The only voice agent context manager with a TTFT feedback loop.

368 3 0

infermark

Benchmark any OpenAI-compatible LLM endpoint: TTFT, inter-token latency, throughput, and P50-P99 percentiles in one command.

299 4 2
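
For readers unfamiliar with the metrics these packages report, here is a minimal sketch of how TTFT, inter-token latency, throughput, and percentiles can be derived from token arrival timestamps. It is not taken from infermark or any other package listed here. The function names `stream_metrics` and `percentile` are hypothetical, and the nearest-rank percentile method is an assumption; real tools may interpolate instead.

```python
import math
from statistics import mean


def percentile(values, q):
    """Nearest-rank percentile of a non-empty list, with q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def stream_metrics(request_start, token_times):
    """Summarize one streamed response.

    request_start: time the request was sent, in seconds.
    token_times:   arrival time of each token, in seconds, in order.
    """
    # TTFT: delay between sending the request and the first token arriving.
    ttft = token_times[0] - request_start
    # Inter-token latency: gaps between consecutive token arrivals.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    # Throughput: tokens received divided by total wall time.
    total = token_times[-1] - request_start
    return {
        "ttft_s": ttft,
        "mean_itl_s": mean(gaps) if gaps else 0.0,
        "tokens_per_s": len(token_times) / total,
    }


# Hypothetical timestamps for a single request (illustrative only).
metrics = stream_metrics(0.0, [0.5, 0.6, 0.7, 0.8, 0.9])

# Hypothetical TTFT samples from five requests, summarized as P50 and P99.
ttft_samples = [0.2, 0.3, 0.5, 0.4, 1.0]
p50 = percentile(ttft_samples, 50)
p99 = percentile(ttft_samples, 99)
```

In practice the timestamps would come from a clock such as `time.perf_counter()` read as each streamed chunk arrives, and percentiles would be computed over many requests rather than five.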