AI incident management

Document, track, and resolve AI-related incidents effectively.

Overview

AI incident management is the practice of detecting, documenting, responding to, and learning from problems that occur with AI systems. Unlike traditional software bugs, AI incidents can be subtle: a model may produce biased outputs, make incorrect predictions, or behave unexpectedly in edge cases without triggering obvious errors.

Effective incident management requires both reactive capabilities (handling problems when they occur) and proactive practices (learning from incidents to prevent recurrence). Organizations that manage AI incidents well can respond quickly to minimize harm, satisfy regulatory requirements, and continuously improve their AI systems.

Why manage AI incidents?

Minimize harm: Quick response limits the impact of AI failures on users and stakeholders
Regulatory compliance: Regulations such as the EU AI Act require incident documentation and may mandate reporting serious incidents
Continuous improvement: Incident analysis reveals weaknesses and drives improvements in AI systems
Stakeholder trust: Transparent incident handling demonstrates responsible AI governance
Knowledge retention: Documented incidents preserve institutional knowledge for future reference

Regulatory requirement

Under the EU AI Act, providers and deployers of high-risk AI systems must report serious incidents to relevant authorities. Maintaining thorough incident records is essential for compliance.

Figure: Incident management table showing columns for incident ID, AI use case, type, severity, status, occurred date, reporter, and approval status. The incident list provides an overview of all reported AI incidents, with filtering and sorting options.

Creating an incident

Navigate to Incident management from the sidebar and click New incident. You'll need to provide:

AI project: Select the project or system involved in the incident
Model/system version: Specify which version of the model was affected
Description: Provide a clear description of what occurred
Reporter: Identify who is reporting the incident
Date occurred: Enter when the incident actually happened
Date detected: Enter when the incident was discovered (may differ from the date it occurred)

Figure: Create new incident modal showing fields for incident information, impact assessment, categories of harm, affected persons, description, and response actions. The incident creation form captures the details needed for investigation and regulatory reporting.
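If you keep incident data in your own tooling before entering it in the form, a record with the same fields could look like the sketch below. This is only an illustration: the class name, field names, and validation rules are assumptions and do not reflect VerifyWise's internal schema or API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class IncidentRecord:
    """Hypothetical record mirroring the form fields; not VerifyWise's schema."""

    ai_project: str
    model_version: str
    description: str
    reporter: str
    date_occurred: date
    date_detected: date

    def __post_init__(self) -> None:
        # Assumed rule: an incident cannot be detected before it happened.
        if self.date_detected < self.date_occurred:
            raise ValueError("date_detected cannot precede date_occurred")
        # Assumed rule: a blank description is not useful for investigation.
        if not self.description.strip():
            raise ValueError("description must not be empty")

    def detection_lag_days(self) -> int:
        """Number of days between occurrence and detection."""
        return (self.date_detected - self.date_occurred).days


# Example values are invented for illustration.
incident = IncidentRecord(
    ai_project="Loan scoring",
    model_version="2.3.1",
    description="Approval rates dropped sharply for one applicant group.",
    reporter="J. Doe",
    date_occurred=date(2025, 3, 1),
    date_detected=date(2025, 3, 4),
)
print(incident.detection_lag_days())  # 3
```

Recording both dates separately makes the detection lag measurable, which can be a useful input when reviewing monitoring coverage.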

Incident types

VerifyWise categorizes incidents to help with analysis and reporting:

Malfunction

The AI system failed to operate as designed or produced errors.

Unexpected behavior

The system behaved in ways not anticipated during development or testing.

Model drift

Model performance degraded over time due to changes in input data patterns.

Misuse

The AI system was used in ways outside its intended purpose.

Data corruption

Issues with training data, input data, or data pipelines affected the system.

Security breach

Unauthorized access, adversarial attacks, or security vulnerabilities were exploited.

Performance degradation

Accuracy, latency, or other performance metrics worsened significantly.

Severity levels

Classify incidents by severity to prioritize response efforts:

Severity	Description	Example
Minor	Limited impact, no harm caused	Occasional incorrect predictions with no downstream effect
Serious	Significant impact on operations or users	Systematic bias affecting a group of users
Very serious	Potential or actual harm to individuals	Safety-critical system failure, data breach
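The three severity levels form a natural ordering, so a response queue can be sorted by them. The sketch below assumes a hypothetical in-house list of (incident ID, severity) pairs; the enum and function names are illustrative and are not part of VerifyWise.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels from the table above, ordered least to most severe."""

    MINOR = 1
    SERIOUS = 2
    VERY_SERIOUS = 3


def triage_order(incidents):
    """Return (incident_id, severity) pairs sorted most severe first.

    Python's sort is stable, so incidents with equal severity keep
    their original relative order.
    """
    return sorted(incidents, key=lambda item: item[1], reverse=True)


# Hypothetical incident IDs for illustration.
queue = [
    ("INC-1", Severity.MINOR),
    ("INC-2", Severity.VERY_SERIOUS),
    ("INC-3", Severity.SERIOUS),
]
ordered_ids = [incident_id for incident_id, _ in triage_order(queue)]
print(ordered_ids)  # ['INC-2', 'INC-3', 'INC-1']
```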

Incident workflow

Incidents progress through defined statuses as they are investigated and resolved:

Open: Incident has been reported and is awaiting investigation
Investigating: Team is actively analyzing the root cause
Mitigated: Immediate actions have been taken to address the issue
Closed: Incident has been fully resolved and documented
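The statuses above can be modeled as a small state machine. The guide lists the statuses but does not say which transitions are permitted, so the strictly forward, one-step-at-a-time rule below is an assumption made for illustration; the names Status, ALLOWED, and advance are hypothetical.

```python
from enum import Enum


class Status(Enum):
    OPEN = "Open"
    INVESTIGATING = "Investigating"
    MITIGATED = "Mitigated"
    CLOSED = "Closed"


# Assumed transitions: strictly forward, one step at a time.
# The product may allow other moves (for example, reopening).
ALLOWED = {
    Status.OPEN: {Status.INVESTIGATING},
    Status.INVESTIGATING: {Status.MITIGATED},
    Status.MITIGATED: {Status.CLOSED},
    Status.CLOSED: set(),
}


def advance(current: Status, target: Status) -> Status:
    """Return the target status if the move is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


status = advance(Status.OPEN, Status.INVESTIGATING)
print(status.value)  # Investigating
```

Encoding the transitions as data keeps the rule set easy to adjust if your process permits skipping or reopening.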

Documenting mitigation actions

For each incident, record both immediate and long-term responses:

Immediate mitigations

Actions taken to stop ongoing harm (rollback, disable feature, manual override)

Corrective actions

Planned fixes to prevent recurrence (model retraining, process changes, monitoring)

Approval workflow

Serious incidents may require approval before being closed. The approval workflow tracks:

Approval status (Pending, Approved, Rejected, Not required)
Who approved the incident closure
Approval date and timestamp
Approval notes and conditions
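One way to reason about the approval fields is as a gate on closure. The sketch below is an assumption about how such a gate could work in your own process, not a description of VerifyWise's behavior; the Approval class and can_close function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class ApprovalStatus(Enum):
    PENDING = "Pending"
    APPROVED = "Approved"
    REJECTED = "Rejected"
    NOT_REQUIRED = "Not required"


@dataclass
class Approval:
    """Hypothetical container for the tracked approval fields."""

    status: ApprovalStatus
    approved_by: Optional[str] = None
    approved_at: Optional[datetime] = None
    notes: str = ""


def can_close(approval: Approval) -> bool:
    """Assumed rule: close only if approval is not required or fully recorded."""
    if approval.status is ApprovalStatus.NOT_REQUIRED:
        return True
    if approval.status is ApprovalStatus.APPROVED:
        # An approved closure should name the approver and carry a timestamp.
        return approval.approved_by is not None and approval.approved_at is not None
    # Pending and Rejected both block closure.
    return False


pending = Approval(status=ApprovalStatus.PENDING)
approved = Approval(
    status=ApprovalStatus.APPROVED,
    approved_by="Compliance lead",
    approved_at=datetime(2025, 3, 10, 14, 30),
    notes="Close after retraining is verified.",
)
print(can_close(pending), can_close(approved))  # False True
```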

Affected parties

Document which individuals or groups were affected by the incident. This information is important for:

Regulatory reporting requirements
Communication and notification obligations
Impact assessment and remediation planning
Lessons learned and prevention strategies

Interim reports

For ongoing incidents, you can mark that an interim report has been filed. This is particularly relevant for regulatory compliance where initial notifications must be submitted within specific timeframes.
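To keep track of a notification timeframe, you can compute a due date from the detection date. In the sketch below the window length is a parameter, and the value 15 in the example is only a placeholder: this guide does not state the actual deadlines, which depend on the applicable regulation and the type of incident. The function names are hypothetical.

```python
from datetime import date, timedelta


def notification_due(date_detected: date, window_days: int) -> date:
    """Return the last day for filing, counting from the detection date."""
    if window_days < 0:
        raise ValueError("window_days must be non-negative")
    return date_detected + timedelta(days=window_days)


def days_remaining(date_detected: date, window_days: int, today: date) -> int:
    """Days left until the due date; negative once the deadline has passed."""
    return (notification_due(date_detected, window_days) - today).days


# Placeholder window; confirm the real timeframe with your compliance team.
detected = date(2025, 3, 4)
due = notification_due(detected, window_days=15)
print(due)  # 2025-03-19
print(days_remaining(detected, 15, today=date(2025, 3, 10)))  # 9
```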

Archiving incidents

Closed incidents can be archived to keep your active incident list manageable while maintaining complete records for audit purposes. Archived incidents remain searchable and can be restored if needed.

Managing model inventory

Track the models involved in incidents

Conducting risk assessments

Identify risks that could lead to incidents

EU AI Act compliance

Understand incident reporting requirements
