PyRank

Hellaswag Python Packages

Python packages tagged with the GitHub topic hellaswag, sorted by relevance and shown with their star counts and monthly downloads.
NahuelGiudizi/llm-benchmark-toolkit

Enterprise-grade LLM evaluation framework | Multi-model benchmarking, honest dashboards, system profiling | Academic metrics: MMLU, TruthfulQA, HellaSwag | Zero fake data | PyPI: llm-benchmark-toolkit | Blog: https://dev.to/nahuelgiudizi/building-an-honest-llm-evaluation-framework-from-fake-metrics-to-real-benchmarks-2b90

796 2 1
    • Data from PyPI, GitHub, ClickHouse, and BigQuery