Dylan Hadfield-Menell, UC Berkeley
research, active

The Principal-Agent Alignment Problem in AI

Hadfield-Menell's Berkeley technical report formalises AI alignment as a principal-agent problem with incomplete contracts, drawing on mechanism design. It introduces inverse reward design and cooperative inverse reinforcement learning as alignment approaches.

Tags

agentic AI, governance-frameworks

At a glance

Published

2021

Jurisdiction

United States

Category

Governance frameworks

Access

Public access
