fleets
KV-cache-aware intelligent routing for self-hosted and hybrid LLM fleets. Route requests using model quality, latency, cost, policy, and live GPU state.
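
As a rough illustration of how these signals could be combined, here is a minimal Python sketch. It is not this project's actual API: the `Backend` fields, the `WEIGHTS` values, the normalization caps, and the `score` and `route` functions are all hypothetical, chosen only to show one plausible way to fold quality, latency, cost, policy, GPU load, and KV-cache reuse into a single routing decision.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Backend:
    """Hypothetical snapshot of one model replica (all fields are assumptions)."""
    name: str
    quality: float       # 0..1, higher is better (e.g. from offline evals)
    latency_ms: float    # recent observed latency
    cost_per_1k: float   # cost per 1k tokens, in arbitrary currency units
    gpu_util: float      # 0..1, live GPU utilization
    kv_hit: float        # 0..1, estimated share of the prompt already in KV cache
    allowed: bool = True # result of a policy check (data residency, tenant, ...)


# Illustrative weights only; a real deployment would tune these.
WEIGHTS = {"quality": 0.3, "latency": 0.2, "cost": 0.2, "gpu": 0.1, "kv": 0.2}


def score(b: Backend, max_latency_ms: float = 2000.0, max_cost: float = 0.05) -> float:
    """Weighted sum of normalized signals; higher means a better target."""
    latency_term = 1.0 - min(b.latency_ms / max_latency_ms, 1.0)
    cost_term = 1.0 - min(b.cost_per_1k / max_cost, 1.0)
    return (
        WEIGHTS["quality"] * b.quality
        + WEIGHTS["latency"] * latency_term
        + WEIGHTS["cost"] * cost_term
        + WEIGHTS["gpu"] * (1.0 - b.gpu_util)  # prefer replicas with headroom
        + WEIGHTS["kv"] * b.kv_hit             # prefer replicas with a warm prefix
    )


def route(backends: Sequence[Backend]) -> Optional[Backend]:
    """Drop policy-rejected backends, then pick the highest score (None if empty)."""
    candidates = [b for b in backends if b.allowed]
    return max(candidates, key=score, default=None)


fleet = [
    Backend("local-a", quality=0.8, latency_ms=400, cost_per_1k=0.01, gpu_util=0.9, kv_hit=0.0),
    Backend("local-b", quality=0.8, latency_ms=400, cost_per_1k=0.01, gpu_util=0.3, kv_hit=0.7),
    Backend("cloud-c", quality=0.95, latency_ms=300, cost_per_1k=0.03, gpu_util=0.1, kv_hit=0.0,
            allowed=False),
]
chosen = route(fleet)
```

In this example `cloud-c` is excluded by policy before scoring. `local-a` and `local-b` differ only in GPU utilization and KV-cache hit estimate, so `local-b` is chosen. A plain weighted sum is only one option; a production router might instead use hard latency or cost limits, or treat KV-cache affinity as a tie-breaker.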