PyRank

Llamacpp Python Packages

Python packages tagged with the GitHub topic llamacpp, sorted by relevance and listed with stars and monthly downloads.
JohnSnowLabs
spark-nlp

State of the Art Natural Language Processing

1.1M 4K 743
xorbitsai
xinference

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.

41K 9K 824
khoj-ai
khoj

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

37K 35K 2K
gptme
gptme

Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make your own persistent autonomous agent on top!

29K 4K 387
khoj-ai
khoj-assistant

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

18K 35K 2K
jjang-ai
jang

JANG — GGUF for MLX. YOU MUST USE JANG_Q RUNTIME. Adaptive Mixed-Precision Quantization + Runtime for Apple Silicon

13K 160 22
Maximilian-Winter
llama-cpp-agent

The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It allows users to chat with LLMs, execute structured function calls, and get structured output. It also works with models that are not fine-tuned for JSON output and function calls.

11K 632 70
containers
ramalama

RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.

11K 3K 338
OEvortex
webscout

Webscout is the all-in-one search and AI toolkit you need. Discover insights with Yep.com, DuckDuckGo, and Phind; access cutting-edge AI models; transcribe YouTube videos; generate temporary emails and phone numbers; perform text-to-speech conversions; and much more!

9K 345 65
abdeladim-s
pyllamacpp

Python bindings for llama.cpp

9K 68 24
eliranwong
toolmate

ToolMate AI, developed by Eliran Wong, is a cutting-edge AI companion that seamlessly integrates agents, tools, and plugins to excel in conversations, generative work, and task execution. Supports custom workflow and plugins to automate multi-step actions.

7K 178 23
llmware-ai
llmware

Unified framework for building enterprise RAG pipelines with small, specialized models

5K 15K 3K
luongnv89
claude-codex-local

Hit your limit? Need privacy? Just swap the model, everything else stays

4K 23 1
ddh0
easy-llama

Python package wrapping llama.cpp for on-device LLM inference

3K 105 7
TAO71-AI
i4-0-client-py

Fully modular AI server and client

3K 2 0
eliranwong
toolmate-lite

ToolMate AI, developed by Eliran Wong, is a cutting-edge AI companion that seamlessly integrates agents, tools, and plugins to excel in conversations, generative work, and task execution. Supports custom workflow and plugins to automate multi-step actions.

3K 178 23
eliranwong
toolmate-android

ToolMate AI, developed by Eliran Wong, is a cutting-edge AI companion that seamlessly integrates agents, tools, and plugins to excel in conversations, generative work, and task execution. Supports custom workflow and plugins to automate multi-step actions.

2K 178 23
Freed-Wu
translate-shell

Translate text with Google, Bing, youdaozhiyun, haici, stardict, OpenAI, local large language models, etc. at the same time from the CLI, GUI (GNU/Linux, Android, macOS and Windows), REPL, Python, shell and Vim.

2K 52 3
corefrg
lexicont

Lightweight LLM agent for text moderation with deterministic pipeline: profanity rules → ML → LLM. Production-grade, YAML configurable. Uses glin-profanity + rapidfuzz + detoxify with Qwen LLM agent (llamacpp/Ollama) + RAG. Early-stop logic. Pure Python, CLI/API/SDK.

2K 1 1
billbillbilly
urban-worm

Workflow of reproducible multimodal inference for urban environment evaluation.

2K 5 4
BrunoArsioli
llama-optimus

Lightweight Python tool using Optuna for tuning llama.cpp flags: towards optimal tok/s for your machine

2K 33 6
mirpo
fastapi-gen

Build LLM-enabled FastAPI applications without build configuration.

1K 12 1
kyegomez
exxa

Exa - PyTorch

1K 26 4
luo-anthony
developergpt

DeveloperGPT is an LLM-powered command-line tool that turns natural language into terminal commands and provides in-terminal chat.

1K 44 5

Data from PyPI, GitHub, ClickHouse, and BigQuery