Risks and challenges
Threat landscape for AI agents: prompt injection, data protection, misuse, and systemic risks.
14 resources
ICO tech futures: Agentic AI
UK ICO tech-futures analysis of how agentic AI interacts with UK GDPR, covering lawful basis for agent-initiated processing, data minimisation across tool calls, transparency duties, and accountability when agents act on behalf of data subjects.
International AI Safety Report 2026
Report from an independent expert panel chaired by Yoshua Bengio for UK DSIT, synthesising evidence on general-purpose AI capabilities, risks, and mitigations. The 2026 edition expands coverage of agentic systems, loss-of-control scenarios, and emerging misuse patterns.
Agent Skills Enable a New Class of Realistic and Trivially Simple Prompt Injections
Research showing that Claude's Skills feature, which auto-loads Markdown instructions from the filesystem, enables trivial prompt injection via a single malicious file. Demonstrates data exfiltration and privilege escalation across common agent deployments.
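The core hazard the paper describes is instructions auto-loaded from the filesystem being trusted implicitly. One way to narrow it is to treat skill files as untrusted until reviewed. A minimal sketch of an allowlist-by-content-hash gate; the scheme, file names, and function names are illustrative, not taken from the paper:

```python
import hashlib

def fingerprint(text: str) -> str:
    """Content hash recorded when a human reviews and approves a skill file."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical review step: only skills whose exact content was approved load.
REVIEWED_SKILL = "# Summarise\nSummarise the input in three bullet points."
APPROVED = {"summarise.md": fingerprint(REVIEWED_SKILL)}

def load_skill(name: str, content: str, allowlist: dict[str, str]) -> str:
    """Refuse to inject instructions that were never reviewed, so a single
    tampered Markdown file cannot silently steer the agent."""
    if allowlist.get(name) != fingerprint(content):
        raise PermissionError(f"unreviewed or modified skill: {name}")
    return content
```

Pinning exact content (rather than file paths) means any post-review edit to a skill file, malicious or not, fails closed.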
OWASP Top 10 for Agentic Applications for 2026
OWASP Gen AI Security Project's top-ten list of agentic application risks for 2026, covering memory poisoning, tool misuse, privilege compromise, intent breaking, goal manipulation, and identity spoofing. Includes example attacks and suggested controls per risk.
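Several of the listed risks, notably tool misuse and privilege compromise, reduce to the same control: an agent should only be able to invoke tools it was explicitly granted. A deny-by-default sketch of that check; the agent names, tool names, and data structure are hypothetical, not drawn from the OWASP document:

```python
# Hypothetical per-agent tool grants; a real deployment would load these
# from policy configuration rather than hard-code them.
TOOL_GRANTS = {
    "researcher": {"web_search", "read_file"},
    "writer": {"read_file"},
}

def authorize_tool_call(agent: str, tool: str) -> None:
    """Deny-by-default gate run before any tool call is dispatched.
    Raises PermissionError unless the grant exists."""
    if tool not in TOOL_GRANTS.get(agent, set()):
        raise PermissionError(f"agent {agent!r} is not granted tool {tool!r}")
```

Running the gate at the dispatch layer, rather than trusting the model to self-restrict, keeps the control effective even when the agent's goals have been manipulated.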
OWASP GenAI Security Project: Top 10 Risks and Mitigations for Agentic AI Security
OWASP reference mapping the top agentic AI threats to concrete technical and procedural mitigations, organised by attack surface (planning, memory, tools, outputs). Aimed at defenders building secure agent stacks rather than researchers cataloguing attacks.
Initial reflections on agentic AI governance
Oliver Patel's practitioner essay flagging how agents break assumptions in enterprise AI governance: autonomous tool calls, emergent multi-agent behaviour, and diffuse accountability. Suggests extensions to risk registers, oversight roles, and policy controls.
The Landscape of Prompt Injection Threats in LLM Agents: From Taxonomy to Analysis
Wang et al. survey proposing a taxonomy of prompt injection threats specific to LLM agents, distinguishing direct, indirect, and tool-mediated vectors. Analyses defences (sandboxing, detection, constrained decoding) against reported attack success rates.
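The indirect and tool-mediated vectors in that taxonomy share a property simple defences exploit: the payload arrives in content the agent should treat as data, not instructions. A naive keyword heuristic along those lines; the patterns are illustrative only, and the defences the survey analyses (sandboxing, detection models, constrained decoding) are considerably stronger:

```python
import re

# Illustrative patterns only; real injection payloads vary far more widely.
SUSPICIOUS = [
    r"ignore (all )?(previous|prior) instructions",
    r"you are now",
    r"reveal (the )?system prompt",
]

def looks_like_injection(untrusted_text: str) -> bool:
    """Flag retrieved documents or tool outputs containing instruction-like
    phrasing before they are placed into the model context."""
    return any(re.search(p, untrusted_text, re.IGNORECASE) for p in SUSPICIOUS)
```

A heuristic like this is cheap to run on every tool result, but as the survey's attack-success figures suggest, it should gate escalation to stronger controls rather than serve as the sole defence.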
AI Agents Break Rules Under Everyday Pressure
IEEE Spectrum article covering research showing agents violate assigned constraints under everyday pressures like deadlines or user insistence. Summarises findings from multiple benchmark studies and discusses implications for deployment in regulated settings.
Agents of Chaos
Shapira et al. document emergent failure modes in multi-agent LLM deployments, including cascading hallucinations, role drift, and collusion. Propose experimental setups to reproduce chaotic behaviour and measure its dependence on agent count and coupling.
AP warns of major security risks with AI agents like OpenClaw
Dutch Data Protection Authority warning on security and privacy risks of agent platforms like OpenClaw, flagging unscoped data access, weak logging, and inability to honour data subject rights when agents act across multiple systems.
Agentic AI Threat Modeling Framework: MAESTRO
Cloud Security Alliance's MAESTRO threat-modelling methodology for multi-agent and agentic systems, extending STRIDE-style analysis across seven architectural layers (foundation models, data operations, agent frameworks, deployment and infrastructure, evaluation and observability, security and compliance, agent ecosystem) with example threats and controls.
Managing Risks of Agentic AI
UC Berkeley CLTC report setting out a risk-management approach for increasingly autonomous AI agents, covering risk identification across the lifecycle, oversight mechanisms, and organisational roles. Aimed at enterprise and public-sector deployers.
Fully Autonomous AI Agents Should Not be Developed
Mitchell et al. (Hugging Face) argue against developing fully autonomous AI agents, mapping a spectrum from human-in-the-loop assistants to unsupervised actors. They enumerate safety, ethical, and accountability risks that grow sharply at each step up the autonomy spectrum.
MITRE ATLAS: adversary tactics and techniques against AI
MITRE ATLAS knowledge base of adversary tactics, techniques, and case studies targeting machine-learning systems, including agent-specific scenarios like prompt injection, tool abuse, and model-in-the-loop manipulation. Structured in ATT&CK-compatible format for defenders.