Trustworthy AI Python Packages

giskard

🐢 Open-Source Evaluation & Testing library for LLM Agents

32K 5K 479

adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

26K 6K 1K

langtest

Deliver safe & effective language models

3K 562 50

helix-adapter

Portable constitutional adapter for any LLM. Forces epistemic markers ([FACT], [REASONED], etc.), real-time drift detection, and tamper-evident receipts. Model-agnostic wrapper.

3K 1 0

nlptest

Deliver safe & effective language models

3K 562 50

akios

Secure runtime for multi-agent AI. Kernel sandboxing (seccomp-bpf), real-time PII redaction, Merkle audit trails.

2K 8 3

rhesis-sdk

The testing platform for AI teams. Bring engineers, PMs, and domain experts together to generate tests, simulate (adversarial) conversations, and trace every failure to its root cause.

2K 375 26

detectors

Python package to accelerate research on generalized out-of-distribution (OOD) detection.

2K 15 1

aiverify-moonshot

Moonshot - A simple and modular tool to evaluate and red-team any LLM application.

2K 334 65

rhesis

The testing platform for AI teams. Bring engineers, PMs, and domain experts together to generate tests, simulate (adversarial) conversations, and trace every failure to its root cause.

2K 375 26

dqm-ml-images

A library to compute data quality metrics

2K 5 3

encypher-ai

Metadata encoding and extraction for AI-generated content

1K 31 3

enforcecore

Lightweight runtime enforcement for agentic AI. PII masking, policy checks, and Merkle audit trails as a decorator.

1K 6 3

auditable

Audit any agent decision across its past, present, and future, on one typed graph.

1K 13 0

groundguard

Verify LLM output against your source documents. Catch hallucinations in RAG pipelines and agentic workflows before they reach users.

1K 0 0

dqm-ml-core

A library to compute data quality metrics

1K 5 3

dqm-ml-pipeline

A library to compute data quality metrics

1K 5 3

trustifai

TrustifAI: A Comprehensive Framework for AI Trustworthiness

931 12 2

dqm-ml

A library to compute data quality metrics

927 5 3

trustlens

Open-source Python library for evaluating ML model reliability beyond accuracy — with calibration, failure, and fairness diagnostics for informed deployment decisions.

829 12 20

dqm-ml-pytorch

A library to compute data quality metrics

825 5 3

proof-engine-wiki

AI agent skill that creates formal, verifiable proofs of claims — every fact computed or cited, never asserted

733 7 1

proofofthought

Proof of thought: LLM-based reasoning using Z3 theorem proving with multiple backend support (SMT2 and JSON DSL)

688 375 25

pydetectgpt

Easy-to-use Python library for detecting AI-generated text

665 2 0