research, active
AgentRx: Diagnosing AI Agent Failures from Execution Trajectories
Introduces the AgentRx benchmark for attributing the first unrecoverable failure in an AI agent's execution trajectory. The benchmark spans API workflows, incident management, and real-world web/file tasks, and supports trajectory-based diagnosis of why agents fail.
Tags
agentic AI, evaluation, failure detection, trajectories
At a glance
Published: 2026
Jurisdiction: Global
Category: Evaluation and benchmarks
Access: Public access