Zhou et al.
dataset, active

WebArena: A Realistic Web Environment for Building Autonomous Agents

Zhou et al. introduce WebArena, a self-hosted web environment covering e-commerce, forum, software development, and content management (CMS) applications, with 812 natural-language tasks. The benchmark evaluates end-to-end browsing agents on realistic multi-step workflows with verifiable outcomes.

Tags

agentic AI, evaluation

At a glance

Published

2023

Jurisdiction

International

Category

Evaluation and benchmarks

Access

Public access

