semantic-caching
Semantic memory and caching for LLM agents with classifier-validated equivalence instead of naive cosine thresholds.
Intelligent LLM agent cost optimization runtime: per-run budget enforcement, semantic caching, and model routing across 15 providers.
Ultra-low-latency LLM gateway with microsecond caching, dynamic routing, budgets, analytics, and forecasting.
Python library for the Semcache API.