multi-agent-eval
An implementation of Anthropic's paper and essay, "A statistical approach to model evaluations".