The AI Incident Database stands as the world's most comprehensive repository of documented AI system failures, containing over 1,200 real-world cases where AI has caused harm. Launched by the Partnership on AI in 2020, this living database transforms scattered incident reports into a searchable, categorized resource that reveals patterns in AI failures across industries. From biased hiring algorithms to autonomous vehicle crashes, each entry provides detailed context about what went wrong, why it happened, and what lessons can be learned. It's essentially the "NTSB database" for AI incidents—turning individual failures into collective wisdom for safer AI deployment.
Unlike scattered news reports or academic papers about AI failures, the AI Incident Database applies rigorous incident classification systems borrowed from aviation and nuclear safety. Each incident receives structured tagging across multiple dimensions: harm type (physical, economic, social), affected populations, AI system characteristics, and contributing factors. The database doesn't just collect incidents—it analyzes patterns, enabling users to identify common failure modes like algorithmic bias in facial recognition or edge case failures in computer vision systems.
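To make that multi-dimensional tagging concrete, here is a minimal Python sketch of how such a structured incident record might be modeled. The class, field names, and enumerated values are illustrative assumptions for this article, not the database's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class HarmType(Enum):
    # Illustrative harm categories; the database's real taxonomy is richer.
    PHYSICAL = "physical"
    ECONOMIC = "economic"
    SOCIAL = "social"


@dataclass
class IncidentRecord:
    """Hypothetical structured record mirroring the tagging dimensions
    described above; field names are assumptions, not the real schema."""
    incident_id: int
    title: str
    harm_types: list[HarmType]
    affected_populations: list[str]    # e.g. ["female job applicants"]
    system_characteristics: list[str]  # e.g. ["resume-screening NLP model"]
    contributing_factors: list[str]    # e.g. ["training data bias"]


# Example: one tagged record for a biased hiring-algorithm incident.
example = IncidentRecord(
    incident_id=101,
    title="Hiring algorithm penalizes resumes from women",
    harm_types=[HarmType.ECONOMIC, HarmType.SOCIAL],
    affected_populations=["female job applicants"],
    system_characteristics=["resume-screening NLP model"],
    contributing_factors=["historically biased training data"],
)
```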
The database also maintains a living taxonomy that evolves as new types of AI incidents emerge. Early entries focused heavily on discrimination and privacy violations, but recent additions increasingly document issues with generative AI, deepfakes, and large language model hallucinations.
The database serves several distinct audiences:
- AI safety researchers conducting empirical studies of failure modes and developing safety metrics based on historical patterns
- Product managers and engineers building AI systems who need to anticipate potential failure modes during design and testing phases
- Risk management professionals in organizations deploying AI who must assess liability exposure and develop incident response protocols
- Regulators and policymakers drafting AI governance frameworks who need an evidence-based understanding of where AI systems commonly fail
- Insurance companies developing AI-related coverage policies and setting premiums based on historical loss data
- Academic researchers studying AI ethics, fairness, and safety who need comprehensive case study materials
The database uses a multi-layered taxonomy system that categorizes incidents across several key dimensions:
- Harm severity: Ranges from minor inconveniences to life-threatening situations, with clear criteria for each level
- System type: Computer vision, natural language processing, recommendation systems, autonomous vehicles, and dozens of other AI application areas
- Failure mode: Whether the incident stemmed from training data bias, adversarial inputs, distribution shift, hardware failures, or human operator errors
- Affected groups: Demographic breakdowns showing which populations experienced harm, revealing patterns of disparate impact
- Industry context: Healthcare, criminal justice, employment, finance, and other sectors where AI deployment carries different risk profiles
Each incident also receives temporal tagging, allowing users to track how AI failure patterns have evolved as technology and deployment practices have changed.
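As a sketch of what temporal tagging makes possible, the snippet below counts incidents per failure mode per year from a small hand-made list of (year, failure mode) pairs; a real analysis would draw these from the database's export, and the failure-mode labels here are assumptions.

```python
from collections import Counter

# Hypothetical (year, failure_mode) pairs drawn from tagged incident records;
# real values would come from the database's temporal and failure-mode tags.
tagged_incidents = [
    (2019, "training data bias"),
    (2021, "training data bias"),
    (2023, "LLM hallucination"),
    (2024, "LLM hallucination"),
    (2024, "deepfake misuse"),
]

# Count incidents per (year, failure mode) to surface temporal trends,
# e.g. generative-AI failure modes appearing only in recent years.
trend = Counter(tagged_incidents)
for (year, mode), count in sorted(trend.items()):
    print(f"{year}  {mode:<20} {count}")
```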
Start with the database's pre-built queries for common use cases rather than browsing randomly through 1,200+ incidents. The "Similar Incidents" feature helps identify clusters of related failures, while the timeline view reveals whether certain types of incidents are becoming more or less common.
For risk assessment purposes, filter incidents by your specific AI application area and harm severity levels. The database's citation system links back to original sources, making it valuable for due diligence documentation.
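A minimal pandas sketch of that filtering workflow, assuming a locally downloaded snapshot file and hypothetical column names (sector, harm_severity, source_url); the actual export schema may differ.

```python
import pandas as pd

# Load a locally saved database snapshot. The filename and the column names
# (sector, harm_severity, source_url) are assumptions for illustration.
incidents = pd.read_csv("aiid_snapshot.csv")

# Narrow to the deployment context being assessed: healthcare incidents
# at the two highest (assumed) severity levels.
relevant = incidents[
    (incidents["sector"] == "healthcare")
    & (incidents["harm_severity"].isin(["severe", "critical"]))
]

# Keep source citations alongside each match for due-diligence records.
relevant[["incident_id", "title", "harm_severity", "source_url"]].to_csv(
    "healthcare_risk_review.csv", index=False
)
```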
The monthly incident summaries provide digestible overviews of newly added cases and emerging patterns, making it easier to stay current without monitoring the full database continuously.
Advanced users can export structured data for quantitative analysis, though be aware that incident reporting rates vary significantly across industries and geographic regions, potentially skewing apparent risk distributions.
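For the quantitative route, a hedged sketch of per-sector, per-year incident counts; the file and column names are again assumptions, and the comment flags why raw counts should not be read as true failure rates.

```python
import pandas as pd

incidents = pd.read_csv("aiid_snapshot.csv")  # hypothetical snapshot export

# Raw incident counts by sector and year. Caution: these reflect reporting
# rates as much as true failure rates; heavily regulated sectors such as
# aviation document incidents far more thoroughly than consumer software,
# so cross-sector comparisons of raw counts can mislead.
counts = (
    incidents.groupby(["sector", "year"])
    .size()
    .rename("incident_count")
    .reset_index()
)
print(counts.sort_values(["sector", "year"]))
```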
The database suffers from significant reporting bias—incidents in regulated industries like aviation get documented more thoroughly than failures in consumer applications. Western English-language incidents are overrepresented compared to global AI deployments.
Many incidents lack technical depth about root causes, focusing more on observable harms than underlying system architecture or training methodology failures. The database also struggles with incidents involving proprietary systems where companies limit information disclosure.
The classification system, while comprehensive, continues evolving as new AI capabilities create novel failure modes not anticipated in the original taxonomy. Users should expect some inconsistency in how similar incidents from different time periods are categorized.
Published: 2020
Jurisdiction: Global
Category: Incident and accountability
Access: Public access