PyRank

LLM Observability Python Packages

Python packages with the GitHub topic llm-observability. Sorted by relevance, with stars and monthly downloads.
pydantic
logfire

AI observability platform for production LLM and agent systems.

21.9M 4K 236
comet-ml
opik

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

6.2M 19K 1K
JudgmentLabs
judgeval

The Continuous-Improvement Stack for Agents. Our environment data and evals power agent improvement and monitoring.

479K 1K 93
agenta-ai
agenta

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.

51K 4K 520
comet-ml
opik-optimizer

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

42K 19K 1K
Helicone
helicone-helpers

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

10K 6K 577
memodb-io
acontext

Agent Skills as a Memory Layer

6K 3K 315
Mandark-droid
genai-otel-instrument

GenAI OpenTelemetry Auto-Instrumentation Library. A comprehensive wrapper for automatic instrumentation of LLM/GenAI applications. Supports all major LLM providers and MCP (Model Context Protocol) tool calls.

5K 1 1
syndicalt
pathlight

Visual debugging, execution traces, and observability for AI agents.

4K 21 3
agentmindsdev
agentminds

Python SDK for AgentMinds — cross-site collective intelligence for production AI agents. Auto-capture errors + push/pull patterns from the network pool.

4K - -
dunetrace
dunetrace

Real-time monitoring of your production agents. No raw content transmitted.

4K 44 4
BlazeUp-AI
observal-cli

Observal is an Observability and Evaluation platform for human-in-the-loop agents

3K 1K 139
helicone
helicone

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

3K 6K 578
aaronlab
browsertrace

Local replay debugger for Browser Use failures with screenshots, model I/O, failed-step timelines, and public-safe HTML exports.

2K 3 23
ENDEVSOLS
longtrainer

Production-ready RAG framework for Python — multi-tenant chatbots with streaming, tool calling, agent mode (LangGraph), vector search (FAISS), and persistent MongoDB memory. Built on LangChain.

2K 28 3
sairintechnologycom
burnlens

Open-source LLM FinOps proxy — track OpenAI, Anthropic (Claude), and Google Gemini costs by feature, team, and customer. Zero code changes. pip install burnlens.

2K 2 0
comet-ml
comet-llm

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

2K 19K 1K
helicone
helicone-async

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

1K 6K 578
kums1234
otel-genai-graph

Project OpenTelemetry GenAI traces into a queryable graph (Neo4j or DuckDB) — agent delegation, cost attribution, blast radius.

1K 0 0
annexkit
annexkit

EU AI Act compliance pipeline for developers — SDK, collector, Annex IV PDF generator. Open-core.

1K 1 0
smigolsmigol
llmkit-sdk

Know what your AI agents cost. API gateway with budget enforcement, session tracking, and MCP tools.

778 10 3
acailic
peaky-peek-server

Self-hosted server for Agent Debugger — FastAPI backend, SQLite/Postgres storage, SSE streaming, and React UI.

732 5 0
acailic
peaky-peek

Lightweight tracing SDK for AI agents. Capture decisions, tool calls, and LLM events with one context manager.

686 5 0
ambertrace
ambertrace

Python SDK for tracing LLM calls (OpenAI, Anthropic, Google Gemini) to the AmberTrace observability platform.

678 5 1
Data from PyPI, GitHub, ClickHouse, and BigQuery.