Quick Start
openclaw skills install abel-agent-evaluation
by abeltennyson · ai-agents
"Testing and benchmarking LLM agents, including behavioral testing, capability assessment, reliability metrics, and production monitoring, in a field where even top agents score below 50% on real-world benchmarks. Use when: agent testing, agent evaluation, benchmark agents, agent reliability, test agent."
openclaw skills install abel-agent-evaluation
Or ask OpenClaw: "Install the agent-evaluation skill"
Install and run agent-evaluation instantly; no setup required.