PyRank

LLM Benchmarking Python Packages

Python packages tagged with the GitHub topic llm-benchmarking, sorted by relevance and shown with stars and monthly downloads.
rangersui
elastik

curl is all you need. Bytes over HTTP, packets over CoAP, one path between intelligences. Ships with CurlBench, the first LLM tool-use benchmark graded by HTTP status codes.

7K 30 1
Pro-GenAI
agent-action-guard

🛡️ Safe AI Agents through Action Classifier

3K 10 7
nhsengland
evalsense

Tools for systematic large language model evaluations

429 4 2
antrixsh
trusteval-ai

Enterprise LLM Evaluation & Responsible AI Framework — Benchmark bias, hallucination, PII leakage, and toxicity across Healthcare, BFSI, Retail & Legal industries. Supports OpenAI, Anthropic, Gemini & HuggingFace. Python SDK + CLI + Web Dashboard. 191 tests. Compliance-ready reports.

266 7 5
Data from PyPI, GitHub, ClickHouse, and BigQuery