deepeval
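A general-purpose SDK for Chinese large language models. It systematically improves interface adaptation and strengthens response parsing, batch processing, and related capabilities, with deep integration for LLM application frameworks in the OpenAI ecosystem such as LangChain, LlamaIndex, and AutoGen. It can also be deployed as an Agent Skill in a range of AI coding tools.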
Automatically discover where and why your LLM is failing: embedding-space clustering combined with statistical hypothesis testing surfaces input slices with elevated failure rates and audits coverage gaps in your test suite.
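The clustering-plus-testing idea can be illustrated with a minimal, standard-library-only sketch. This is not the tool's actual API: the names `kmeans`, `binom_tail`, and `find_failing_slices` are hypothetical, and a toy k-means with a one-sided exact binomial test (Bonferroni-corrected) stands in for whatever clustering and hypothesis test the project really uses. It assumes each input already has an embedding vector and a pass/fail flag.

```python
import math


def kmeans(points, k, iters=20):
    """Toy k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point farthest from all centers chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def binom_tail(n, x, p):
    """P(X >= x) for X ~ Binomial(n, p), computed exactly."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x, n + 1)
    )


def find_failing_slices(embeddings, failed, k=2, alpha=0.05):
    """Flag clusters whose failure count is unusually high vs. the global rate."""
    labels = kmeans(embeddings, k)
    base_rate = sum(failed) / len(failed)
    flagged = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        fails = sum(failed[i] for i in idx)
        # One-sided test: is this cluster's failure count high under base_rate?
        p_value = binom_tail(len(idx), fails, base_rate)
        if p_value < alpha / k:  # Bonferroni correction across k clusters
            flagged.append(
                {"cluster": c, "size": len(idx), "failures": fails, "p_value": p_value}
            )
    return flagged
```

Calling `find_failing_slices(embeddings, failed, k=...)` with a list of embedding tuples and a parallel list of booleans returns one record per flagged cluster. In practice the embeddings would come from a sentence-embedding model and the number of clusters would need tuning. Both are outside the scope of this sketch.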