Reliability Testing Python Packages

falsifyai

Portable, content-addressed reliability evidence for LLM systems. Capture how a model behaves under perturbation; preserve, verify, and diff the evidence across model changes.

3K 0 0
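
As a rough illustration of what "content-addressed" evidence can mean, the sketch below hashes a canonical JSON form of a recorded observation, then verifies and diffs records. It uses only the Python standard library and is not falsifyai's API; the function names and the record fields are hypothetical.

```python
import hashlib
import json


def content_address(record: dict) -> str:
    """Return the SHA-256 hex digest of the record's canonical JSON form."""
    # Sorting keys and fixing separators makes the digest independent of key order.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record: dict, expected_digest: str) -> bool:
    """Check that a stored record still matches the digest it was filed under."""
    return content_address(record) == expected_digest


def diff_records(old: dict, new: dict) -> dict:
    """Map each top-level key whose value changed to an (old, new) pair."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Hypothetical evidence records captured before and after a model change.
before = {"prompt": "2+2?", "perturbation": "typo", "output": "4"}
after = {"prompt": "2+2?", "perturbation": "typo", "output": "four"}

digest = content_address(before)
print(verify(before, digest))       # the untouched record verifies
print(verify(after, digest))        # a changed record does not
print(diff_records(before, after))  # only the "output" field differs
```

Because the digest depends only on content, the same record gets the same address on any machine, which is what makes such evidence portable.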

ai-stability

CLI-first LLM stability analyzer for measuring output consistency across repeated prompt runs.

421 0 0
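
One simple way to quantify output consistency across repeated runs of the same prompt is sketched below. It is not ai-stability's implementation; the two metrics (share of the most common output, and mean pairwise string similarity) and the function names are assumptions chosen for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def exact_match_rate(outputs: list[str]) -> float:
    """Fraction of runs that produced the single most common output."""
    if not outputs:
        raise ValueError("at least one output is required")
    _, count = Counter(outputs).most_common(1)[0]
    return count / len(outputs)


def mean_pairwise_similarity(outputs: list[str]) -> float:
    """Average difflib similarity ratio over all unordered pairs of outputs."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        # A single run has nothing to disagree with.
        return 1.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)


# Hypothetical outputs from four runs of one prompt.
runs = ["Paris", "Paris", "paris", "Paris"]
print(exact_match_rate(runs))
print(mean_pairwise_similarity(runs))
```

Exact matching is strict and cheap; the pairwise ratio is more forgiving of small wording differences but grows quadratically with the number of runs.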

langchain-chaos-middleware

A middleware for LangChain agents that intentionally injects failures (exceptions) into tool and model calls to test agent resilience.

413 2 0