unified-memory
Local inference server for Apple Silicon that hot-swaps MLX models (LLM, vision, embeddings, TTS, STT) via OpenAI-compatible API