PyRank

LLMs Benchmarking Python Packages

Python packages tagged with the GitHub topic llms-benchmarking, sorted by relevance and listed with their stars and monthly downloads.
parea-ai
parea-ai

Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)

3K 82 11
multinear
multinear

Develop reliable AI apps

401 45 1
matus-pikuliak
genderbench

Evaluation suite for gender biases in LLMs.

290 5 1
melvinebenezer
liah

Insert a Lie in a Haystack and evaluate the model's ability to detect it.

72 2 0
matus-pikuliak
gender-bench

GenderBench - Evaluation suite for gender biases in LLMs

2 5 1
Data from PyPI, GitHub, ClickHouse, and BigQuery