AI Incident Database

Partnership on AI

Summary

The AI Incident Database is the world's most comprehensive repository of documented AI system failures, containing over 1,200 real-world cases in which AI has caused harm. Launched by the Partnership on AI in 2020, this living database transforms scattered incident reports into a searchable, categorized resource that reveals patterns in AI failures across industries. From biased hiring algorithms to autonomous vehicle crashes, each entry provides detailed context about what went wrong, why it happened, and what lessons can be learned. It is, in effect, an NTSB-style database for AI incidents, turning individual failures into collective wisdom for safer AI deployment.

What makes this database unique

Unlike scattered news reports or academic papers about AI failures, the AI Incident Database applies rigorous incident classification systems borrowed from aviation and nuclear safety. Each incident receives structured tagging across multiple dimensions: harm type (physical, economic, social), affected populations, AI system characteristics, and contributing factors. The database does more than collect incidents: it analyzes patterns, enabling users to identify common failure modes such as algorithmic bias in facial recognition or edge-case failures in computer vision systems.

The database also maintains a living taxonomy that evolves as new types of AI incidents emerge. Early entries focused heavily on discrimination and privacy violations, but recent additions increasingly document issues with generative AI, deepfakes, and large language model hallucinations.

Who this resource is for

AI safety researchers conducting empirical studies on failure modes and developing safety metrics based on historical patterns

Product managers and engineers building AI systems who need to anticipate potential failure modes during design and testing phases

Risk management professionals in organizations deploying AI who must assess liability exposure and develop incident response protocols

Regulators and policymakers drafting AI governance frameworks who need evidence-based understanding of where AI systems commonly fail

Insurance companies developing AI-related coverage policies and setting premiums based on historical loss data

Academic researchers studying AI ethics, fairness, and safety who need comprehensive case study materials

How incidents get classified

The database uses a multi-layered taxonomy system that categorizes incidents across several key dimensions:

Harm severity: Ranges from minor inconveniences to life-threatening situations, with clear criteria for each level

System type: Computer vision, natural language processing, recommendation systems, autonomous vehicles, and dozens of other AI application areas

Failure mode: Whether the incident stemmed from training data bias, adversarial inputs, distribution shift, hardware failures, or human operator errors

Affected groups: Demographic breakdowns showing which populations experienced harm, revealing patterns of disparate impact

Industry context: Healthcare, criminal justice, employment, finance, and other sectors where AI deployment carries different risk profiles

Each incident also receives temporal tagging, allowing users to track how AI failure patterns have evolved as technology and deployment practices have changed.
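
To make these dimensions concrete, here is a minimal sketch of how a classified incident record might be modeled in code. The field names and severity levels below are illustrative assumptions, not the database's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class HarmSeverity(Enum):
    # Illustrative levels only; the real taxonomy defines its own criteria
    MINOR = "minor"
    MODERATE = "moderate"
    SEVERE = "severe"
    LIFE_THREATENING = "life-threatening"


@dataclass
class IncidentRecord:
    """Hypothetical model of one classified incident."""
    incident_id: int
    title: str
    occurred_on: date                       # temporal tagging
    harm_severity: HarmSeverity
    system_type: str                        # e.g. "computer vision"
    failure_modes: list[str] = field(default_factory=list)    # e.g. ["training data bias"]
    affected_groups: list[str] = field(default_factory=list)  # disparate-impact tracking
    industry_context: str = "unspecified"   # e.g. "healthcare"
    source_urls: list[str] = field(default_factory=list)      # citations to original reports
```

Modeling each dimension as its own field is what makes the cross-cutting queries described below possible, such as filtering by sector and severity at the same time.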

Getting the most from the data

Start with the database's pre-built queries for common use cases rather than browsing randomly through 1,200+ incidents. The "Similar Incidents" feature helps identify clusters of related failures, while the timeline view reveals whether certain types of incidents are becoming more or less common.

For risk assessment purposes, filter incidents by your specific AI application area and harm severity levels. The database's citation system links back to original sources, making it valuable for due diligence documentation.
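
As a sketch of that filtering workflow, assuming you have exported incidents to a CSV snapshot (the file name and column names here are hypothetical, not the database's actual export format):

```python
import pandas as pd

# Hypothetical export snapshot; real column names may differ
incidents = pd.read_csv("aiid_snapshot.csv")

# Narrow to your application area and the severity levels you care about
relevant = incidents[
    (incidents["system_type"] == "recommendation system")
    & (incidents["harm_severity"].isin(["severe", "life-threatening"]))
]

# Keep the source links alongside each row for due-diligence documentation
print(relevant[["incident_id", "title", "source_url"]].head(10))
```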

The monthly incident summaries provide digestible overviews of newly added cases and emerging patterns, making it easier to stay current without monitoring the full database continuously.

Advanced users can export structured data for quantitative analysis, though be aware that incident reporting rates vary significantly across industries and geographic regions, potentially skewing apparent risk distributions.
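
A minimal example of that kind of quantitative pass, again assuming a hypothetical CSV export with illustrative column names; note that the raw counts are only as trustworthy as the underlying reporting rates:

```python
import pandas as pd

incidents = pd.read_csv("aiid_snapshot.csv", parse_dates=["occurred_on"])

# Incident counts per year and sector. A rising count can reflect better
# reporting in that sector, not necessarily rising risk.
trend = (
    incidents
    .assign(year=incidents["occurred_on"].dt.year)
    .groupby(["year", "industry_context"])
    .size()
    .unstack(fill_value=0)
)
print(trend.tail())
```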

Current limitations and blind spots

The database suffers from significant reporting bias: incidents in regulated industries like aviation are documented more thoroughly than failures in consumer applications, and Western, English-language incidents are overrepresented relative to global AI deployments.

Many incidents lack technical depth about root causes, focusing more on observable harms than underlying system architecture or training methodology failures. The database also struggles with incidents involving proprietary systems where companies limit information disclosure.

The classification system, while comprehensive, continues evolving as new AI capabilities create novel failure modes not anticipated in the original taxonomy. Users should expect some inconsistency in how similar incidents from different time periods are categorized.

Tags

incident reporting, AI safety, risk management, accountability, harm documentation, case studies

At a glance

Published: 2020
Jurisdiction: Global
Category: Incident and accountability
Access: Public access
