LLM Safety Python Packages

nemoguardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

351K 7K 747

deepteam

DeepTeam is a framework to red team LLMs and AI agents.

82K 2K 319

agent-airlock

A deny-by-default contract & type-checker layer for AI agent tool calls — Pydantic-based, in-process, zero-core-deps. Validates the actual tool-call payload (ghost-arg stripping, strict types, self-healing retries) beneath MCP gateways & firewalls. Works with LangChain, OpenAI Agents SDK, PydanticAI & CrewAI.

17K 10 3

styxx

The measurement layer for machine minds. Reads what a model means and whether it holds the truth; certifies every claim re-runs. meaning_diff + OATH certify + mind profiles + live grounding signal + the cognometric instruments. No torch, no LLM in the loop for the core; MIT, open at the core.

13K 13 1

agentguard-governance

Governance layer for autonomous AI agents — pre-flight checks, runtime monitoring, and post-session reporting.

12K 1 0

agent-audit

Static security scanner for LLM agents — prompt injection, MCP config auditing, taint analysis. 51 rules mapped to OWASP Agentic Top 10 (2026). Works with LangChain, CrewAI, AutoGen.

8K 192 22

uqlm

UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection.

7K 1K 127

neurosym-ai

Neuro-symbolic guardrails for LLMs — injection detection, harm filters, output guards, streaming safety, and action-plan validation.

2K 2 0

qwed

A deterministic verification layer for AI systems. QWED verifies AI outputs using mathematics, symbolic reasoning, and formal methods (Z3, SMT, SymPy), creating an auditable trust boundary for agentic AI. Not generation. Verification.

2K 58 10

yamtam-engine

Personal Agent OS for Claude Code — 46 hooks, 1980 skills, 101 agents. Blocks rm -rf, prompt injection, pipe-to-shell at runtime. Apache 2.0.

1K 1 0

qwed-finance

Deterministic verification middleware for banking and financial AI. NPV, IRR, loan amortization, and interest calculations with QWED precision.

1K 4 2

lintlang

lintlang is a static linter for AI agent configs, tool descriptions, and system prompts that runs zero-LLM quality gating in CI. Catches language-level failures (vague tool descriptions, missing stop conditions, schema gaps) before they reach runtime, with deterministic regex + structural detectors and no model calls.

1K 48 2

geh

One command to benchmark AI guardrails and coding agents across safety, security, jailbreak, prompt-injection, and secure-code tasks.

999 23 3

kitelogik

Governance control plane for AI agents — policy enforcement at the tool-call, spawn, delegate, plan, and budget layer. OPA-backed, Apache-2.0.

820 3 2

fivedrisk

Per-action AI agent risk scoring and governance. Deterministic 5D scoring, HITL gating, FinOps, Agent Cost Management, Markov drift, audit log. Apache-2.0.

659 0 0

gateguard-ai

A fact-forcing hook gate for Claude Code. Makes the AI pause and investigate before editing.

641 7 0

brix-protocol

Runtime Reliability Infrastructure for LLM Pipelines — enforce deterministic rules, measure the Balance Index, and audit every decision.

580 8 0

csl-core

Deterministic policy language for AI agents. Z3 + TLA+ dual-engine formal verification. Runtime enforcement <1ms.

532 15 12

aetherlab

Official Python SDK for the AetherLab API: AI guardrails, LLM safety, and content moderation.

480 1 0

pytest-wardenbot

Pytest plugin for testing chatbots and LLM apps — prompt injection, jailbreaks, system-prompt leaks, hallucinations, brand drift.

463 0 0

qwed-tax

The Verification Gate for AI-Generated Tax Decisions. Deterministic tax verification layer powered by Z3 and Decimal math — sits between AI agents and execution systems. Not a calculator, not a filing platform.

458 1 1

blackwall-llm-shield-python

Blackwall LLM Shield is an open-source AI security toolkit for JavaScript and Python that protects LLM apps from prompt injection, sensitive data leaks, unsafe tool calls, and hostile RAG content with prompt sanitization, PII masking, output inspection, policy enforcement, and audit trails.

456 1 0

qwed-infra

Deterministic verification layer for infrastructure as code. Verifies AWS IAM policies, network reachability, and cost estimates before deployment — sits between AI agents and cloud execution. Powered by Z3 and NetworkX.

422 1 1

agi-pragma

AI Action Firewall — seven-stage Decision Intelligence Core for safe agentic AI

376 0 0