self-hosted-ai
Local AI load balancer for Ollama fleets — auto-discovery, smart routing, OpenAI-compatible API, zero config. Perfect for Mac Minis & Studios.
Mixed-vendor GPU inference cluster manager with speculative decoding
Small Language Model Inference, Fine-Tuning, Evaluation and Observability.
Self-hosted AI security proxy. Redact PII, block prompt injection, route to any LLM provider. OpenAI-compatible.