ml-evaluation
A single-file Python CLI that pre-registers AI/ML accuracy claims with SHA-256. Lock the threshold before the data, or it didn't happen.
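The pre-registration idea can be sketched with nothing but the standard library. This is a minimal illustration, not the tool's actual interface: the function names `commit_claim` and `verify_claim` and the claim fields (`metric`, `threshold`, `dataset`) are hypothetical. The claim is serialized to canonical JSON and hashed with SHA-256 before any evaluation data is seen; publishing the digest makes it possible to check later that the threshold was not changed.

```python
import hashlib
import json


def commit_claim(claim: dict) -> str:
    """Return the SHA-256 hex digest of a canonical JSON encoding of the claim.

    Keys are sorted and whitespace is fixed so that the same claim always
    produces the same digest, regardless of key order.
    """
    canonical = json.dumps(claim, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_claim(claim: dict, digest: str) -> bool:
    """Check that a revealed claim matches a previously published digest."""
    return commit_claim(claim) == digest


# Hypothetical claim, locked before the evaluation data exists.
claim = {"metric": "accuracy", "threshold": 0.90, "dataset": "held-out-test"}
digest = commit_claim(claim)

# Later: the original claim verifies, a quietly lowered threshold does not.
tampered = dict(claim, threshold=0.85)
print(verify_claim(claim, digest))     # original claim
print(verify_claim(tampered, digest))  # altered threshold
```

A claim with only a few plausible values can be guessed by hashing candidates, so a real commitment would likely also include a random nonce that is revealed together with the claim.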
A metrics library for evaluating vision-language models within the PyTorch ecosystem.