llm-optimization
Make LLM inference faster with chunk-level KV cache reuse
ContextCore: an MCP server for Claude (or any AI tool) that enables large token savings through hybrid search (BM25 + embeddings)
Nadir is a Python package that dynamically chooses the best LLM for your prompt by balancing complexity, cost, and response time.
Build unstructured-to-structured data transformation pipelines
Superpipe - optimized LLM pipelines for structured data