Reusable QA harness for testing LLM agents and personas. Defines test suites with must-ace tasks, refusal edge cases, scoring rubrics, and regression protocols. Use when validating agent behavior, tes
选择开发工具安装