LLM Observability Python Packages

logfire

AI observability platform for production LLM and agent systems.

19M 4K 253

opik

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

3.8M 20K 2K

judgeval

The Continuous-Improvement Stack for Agents. Our environment data and evals power agent improvement and monitoring.

147K 1K 93

agenta

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.

41K 4K 559

opik-optimizer

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

31K 20K 2K

latitude-sdk

Latitude is the open-source AI monitoring platform.

29K 4K 351

latitude-telemetry

Latitude is the open-source AI monitoring platform.

12K 4K 351

helicone-helpers

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

8K 6K 618

acontext

Agent Skills as a Memory Layer

5K 4K 323

genai-otel-instrument

GenAI OpenTelemetry auto-instrumentation library: a comprehensive wrapper for automatic instrumentation of LLM/GenAI applications. Supports all major LLM providers and MCP (Model Context Protocol) tool calls.

5K 2 1

dunetrace

Real-time monitoring of production AI agents. No raw content transmitted.

5K 56 12

peekr

Zero-config observability for AI agents. Auto-instruments OpenAI & Anthropic SDKs.

3K 3 0

helicone-async

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

3K 6K 620

mcpeye

mcpeye Python SDK — open-source product analytics for MCP servers. See why your agent is failing.

2K 3 0

comet-llm

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

2K 20K 2K

shyftlabs-continuum

Continuum — the agent runtime by ShyftLabs. Build, orchestrate, ship.

2K 75 8

agent-panorama

See what your AI agents do, whether it's worth it, and what it costs - a manager-readable report + local dashboard from Langfuse/LangSmith traces (or a one-line live callback). Open source, runs locally.

2K 5 0

agentmetrics

Open-source AI agent observability. Track cost, latency, tokens, and errors for OpenClaw, Hermes, LangChain, CrewAI, LlamaIndex, OpenAI Agents, AutoGen, and Anthropic Managed Agents. Self-hosted, no cloud required.

2K 3 0

forgesight-api

Vendor-neutral, OpenTelemetry-first telemetry for AI agents — traces, cost, budgets & a tamper-evident audit trail to any backend, no agent-code changes.

1K 4 0

forgesight-core

Vendor-neutral, OpenTelemetry-first telemetry for AI agents — traces, cost, budgets & a tamper-evident audit trail to any backend, no agent-code changes.

1K 4 0

helicone

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

1K 6K 620

agentmetrics-hermes

1K 3 0

longtrainer

Production-ready RAG framework for Python — multi-tenant chatbots with streaming, tool calling, agent mode (LangGraph), vector search (FAISS), and persistent MongoDB memory. Built on LangChain.

1K 30 2

spanlens

Spanlens SDK for Python. Agent tracing, LLM usage capture, and cost observability.

1K 9 0