PyRank

LLM Evaluation Toolkit Python Packages

Python packages with the GitHub topic llm-evaluation-toolkit. Sorted by relevance, with stars and monthly downloads.
zli12321
qa-metrics

An easy-to-use Python package for running quick, basic QA evaluations. It includes standardized QA evaluation metrics and semantic evaluation metrics: black-box and open-source large language model prompting and evaluation, exact match, F1 score, PEDANT semantic match, and transformer match. The package also supports prompting the OpenAI and Anthropic APIs.

8K 61 6
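
For readers unfamiliar with the exact match and F1 metrics named in the qa-metrics description, the sketch below shows how they are commonly computed for short-answer QA (SQuAD-style normalization and token overlap). It is an illustration under assumptions, not the qa-metrics API: the helper names `normalize`, `exact_match`, and `f1_score` are hypothetical, and the package's own normalization rules may differ.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    Assumption: SQuAD-style normalization; a given library may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, reference: str) -> bool:
    """True when both answers are identical after normalization."""
    return normalize(prediction) == normalize(reference)


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred = normalize(prediction)
    ref = normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Exact match is strict and rewards only fully correct answers, while token-level F1 gives partial credit when a prediction contains extra or missing words. Semantic metrics such as the PEDANT and transformer match listed above target answers that are correct but worded differently, which neither of these two metrics captures.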
parea-ai
parea-ai

Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)

3K 82 11
Pacific-AI-Corp
langtest

Deliver safe & effective language models

2K 557 49
JohnSnowLabs
nlptest

Deliver safe & effective language models

2K 557 49
scalexi
scalexi

scalexi is an open-source Python library for low-code development and fine-tuning of Large Language Models (LLMs). It extends beyond its initial integration with OpenAI models, offering a scalable framework for various LLMs.

1K 13 2
nhsengland
evalsense

Tools for systematic large language model evaluations

442 4 2
    • Data from PyPI, GitHub, ClickHouse, and BigQuery