research • active

How to Evaluate Control Measures for LLM Agents? A Trajectory from Today to Superintelligence

Korbak et al. (UK AISI) propose a methodology for evaluating AI control measures against increasingly capable LLM agents, using red-team protocols and capability elicitation. The paper lays out a trajectory from current models to hypothetical superintelligent agents.

At a glance

Published: 2025

Jurisdiction: United Kingdom

More in Evaluation and benchmarks

tau-bench: A benchmark for tool-agent-user interaction

Sierra Research • 2024

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

Jimenez et al. (Princeton) • 2023

WebArena: A Realistic Web Environment for Building Autonomous Agents

Zhou et al. • 2023

Related resources

Practices for governing agentic AI systems: OpenAI's seven safety principles

Governance frameworks • OpenAI

Taxonomy of Failure Modes in Agentic AI Systems

Risk taxonomies • Microsoft

EleutherAI LM Evaluation Harness

Assessment and evaluation • EleutherAI

