PyRank

Llama Cpp Python Packages

Python packages with the GitHub topic llama-cpp. Sorted by relevance, with stars and monthly downloads.
shakfu
cyllama

A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

22K 25 20
shakfu
cyllama-cuda12

A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

16K 25 20
shakfu
cyllama-vulkan

A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

15K 25 20
shakfu
cyllama-sycl

A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

12K 25 20
shakfu
cyllama-rocm

A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

10K 25 20
rohitgarg19
opencode-llmstack

Cursor-Auto / Claude-tier-style serving for local GGUF models on Mac (M4 Max, 64 GB). FastAPI router fronts llama-swap + llama.cpp, classifying each request into a coder, planner, or uncensored-planner tier. OpenAI-compatible API, opencode integration, per-project subshell, one `llmstack` console-script.

7K 0 0
FarisZahrani
llama-cpp-py-sync

Auto-synced CFFI ABI Python bindings for llama.cpp with prebuilt wheels (CPU/CUDA/Vulkan/Metal).

5K 4 1
antoinezambelli
forge-guardrails

A Python framework for self-hosted LLM tool-calling and multi-step agentic workflows

3K 1 0
benwalkerai
inzen-bot

Multi-provider AI chatbot for the terminal. Chat with Claude, GPT-4, Ollama or llama.cpp without leaving your shell. Full conversation history. pip install inzen-bot

2K 0 0
nrl-ai
nom-vn

Open-source Python toolkit + playground for Vietnamese AI — RAG over docs, diacritic restore, segmentation, OCR, normalization. Local-first, multi-backend (Ollama / llama.cpp / HF Transformers / OpenAI / Anthropic).

2K - -
youngharold
tightwad

Mixed-vendor GPU inference cluster manager with speculative decoding

2K 22 2
thilomichael
llama-buddy

CLI wrapper for llama.cpp providing an ollama-like experience

2K 8 0
furuse-kazufumi
llmesh-mcp

Security-first local LLM swarm over MCP with signed peer discovery, fail-closed validation, audit trails, and Docker-based PoC nodes.

2K 0 0
BenevolentJoker-JohnL
sollol

Super Ollama Load Balancer - Performance-aware routing for distributed Ollama deployments with Ray, Dask, and adaptive metrics

2K 4 2
nrl-ai
edgevox

Offline voice agent framework for robots.

1K 6 0
Anyesh
cognitive-cache

Optimal context orchestration for LLMs

1K 2 0
mycellm
mycellm

Distributed LLM inference across heterogeneous hardware

1K 4 1
shakfu
inferna

An early-stage experimental nanobind wrapper around llama.cpp

994 0 0
milika
egovault

Local-first personal data vault — ingest your emails, files & messages, enrich with a local LLM (llama.cpp), and chat with your own data via hybrid RAG (BM25 + vectors + HyDE). No cloud, no tracking, everything in one SQLite file.

953 0 0
fajknli
palacelite

Lightweight local AI memory system

866 3 0
rafaelpierre
openai-agents-redis

Session management for OpenAI Agents SDK using Redis.

841 16 4
e-lab
grammarflow

Ensuring parsability of LLM responses in agent chains

830 9 0
nuance1979
llama-server

LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.

817 135 14
notolog
notolog

Notolog - Python Markdown Editor

711 25 6
    • Data from PyPI, GitHub, ClickHouse, and BigQuery