Evaluations Python Packages

railtracks

An agentic framework that helps developers build resilient agentic systems

4K 138 15

ecp-runtime

ECP is a standardized interface for orchestrating, auditing, and enforcing authority limits in AI agent evaluations. It moves evaluation from "brittle Python scripts" to a deterministic infrastructure protocol.

2K 8 1

ecp-sdk

2K 8 1

evaluations

This library implements various metrics (including Kaggle Competition, Medicine) for evaluating ML, DL, and AI models and algorithms. 📐📊📈📉📏

2K 14 1

log10-io

Unified LLM data management

1K 97 12

apolien

AI Safety Evaluation Library

1K 5 1

railtracks-cli

An agentic framework that helps developers build resilient agentic systems

868 138 15

evret

Evals framework for Information Retrieval Systems

745 18 3

mandoline

Official Python client for the Mandoline API

441 2 0

aniket-agentlens-sdk

Python tracing SDK for AgentLens, an open-source observability platform for AI agents.

197 0 0