PyRank

Apple Silicon Python Packages

Python packages tagged with the GitHub topic apple-silicon, sorted by relevance. Each entry shows its stars and monthly downloads.
Blaizzy
mlx-vlm

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

383K 5K 539
Blaizzy
mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

186K 7K 594
bithuman-product
bithuman

Portable C++ avatar runtime — Python bindings via pybind11. Powers the bitHuman Essence pipeline cross-platform.

127K - -
raullenchai
rapid-mlx

The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s cached TTFT, 100% tool calling. 17 tool parsers, prompt cache, reasoning separation, cloud routing. Drop-in OpenAI replacement. Works with Claude Code, Cursor, Aider.

48K 2K 287
filipstrand
mflux

MLX native implementations of state-of-the-art generative image models

36K 2K 143
Andyyyy64
whichllm

Find the local LLM that actually runs and performs best on your hardware. Ranked by real, recency-aware benchmarks, not parameter count. One command, run it instantly.

23K 897 34
cubist38
mlx-openai-server

A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Developed using Python and powered by the FastAPI framework, it provides an efficient, scalable, and user-friendly solution for running MLX-based vision and language models locally with an OpenAI-compatible interface.

18K 336 61
youssofal
mtplx

2.24x decode TPS increase on Qwen 3.6 27B @ temp 0.6 | Native MTP speculative decoding on Apple Silicon with no external drafter.

16K 492 19
jjang-ai
jang

JANG — GGUF for MLX. YOU MUST USE JANG_Q RUNTIME. Adaptive Mixed-Precision Quantization + Runtime for Apple Silicon

13K 160 22
ARahim3
mlx-tune

Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT, Embedding, and OCR fine-tuning — natively on MLX. Unsloth-compatible API.

9K 1K 80
tanavc1
llm-autotune

Zero-config local LLM optimization for Ollama, LM Studio, and Apple Silicon MLX. Reduces TTFT by 40%, wall time for local agents by 46%, and RAM usage by 3x.

8K 25 1
tlkh
asitop

Perf monitoring CLI tool for Apple Silicon

7K 5K 208
tillahoffmann
jax-mps

A JAX backend for Apple Metal Performance Shaders (MPS), enabling GPU-accelerated JAX computations on Apple Silicon.

6K 132 13
rohitgarg19
opencode-llmstack

Cursor-Auto / Claude-tier-style serving for local GGUF models on Mac (M4 Max, 64 GB). FastAPI router fronts llama-swap + llama.cpp, classifying each request into a coder, planner, or uncensored-planner tier. OpenAI-compatible API, opencode integration, per-project subshell, one `llmstack` console-script.

6K 0 0
arizawan
vidlizer

Point it at a video, image, or PDF — get structured JSON. Runs local (Ollama, LM Studio, oMLX) or cloud (OpenRouter). CLI + MCP server for Claude Code, Cursor, and Claude Desktop.

6K 1 1
hellobertrand
zxc-compress

High-performance asymmetric lossless compression. 40%+ faster decompression than LZ4 on ARM64 with better compression ratios. Optimized for Game Assets, Firmware & App Bundles.

5K 374 7
wst24365888
libstreamvbyte

A C++ implementation of StreamVByte, with Python bindings.

5K 10 1
DarshanFofadiya
sparsecore

Actually-sparse dynamic training for PyTorch. CPU-native, Apple Silicon first. Pluggable routers, drop-in SparseLinear.

3K 9 2
DarshanFofadiya
sparselab

Actually-sparse dynamic training for PyTorch. CPU-native, Apple Silicon first. Pluggable routers, drop-in SparseLinear.

3K 9 2
manjunathshiva
turboquant-mlx-full

Extreme weight + KV cache compression for LLMs on Apple Silicon (MLX implementation of Google's TurboQuant)

3K 23 3
geeks-accelerator
ollama-herd

Local AI load balancer for Ollama fleets — auto-discovery, smart routing, OpenAI-compatible API, zero config. Perfect for Mac Minis & Studios.

2K 7 0
dualform-labs
m5-infer

Extraordinary speed, extraordinary quality — MLX-based inference engine for Apple Silicon.

2K 0 1
mordechaipotash
brain-mcp

Your AI has amnesia. Persistent memory and cognitive context for AI. 25 MCP tools. 12ms recall.

2K 52 13
druide67
asiai

Multi-engine LLM benchmark & monitoring CLI for Apple Silicon

2K 7 2
Data from PyPI, GitHub, ClickHouse, and BigQuery