Back to AI governance templates
Data and Security AI Policies

Prompt Security and Prompt Hardening Policy

Guidance to stop prompt injection and sanitize prompts.

Owner: Application Security Lead

Purpose

Prevent prompt-based attacks (injection, data exfiltration, jailbreaks) by defining standards for prompt design, sanitization, and runtime guardrails across conversational and generative AI applications.

Scope

Applies to all prompt-driven interfaces exposed to employees, partners, or customers, as well as internal agent frameworks that rely on prompts to orchestrate actions.

  • Customer support chatbots and knowledge assistants
  • Internal copilots and code-generation tools
  • Agent frameworks interacting with external APIs or tools
  • Shared prompt templates and prompt libraries

Definitions

  • Prompt Injection: Attack where user input manipulates model instructions to disclose secrets or perform unintended actions.
  • System Prompt: Non-user visible instruction set controlling model behaviour.
  • Guardrail Prompt: Supplemental prompt that enforces safety boundaries or refusal logic.

Policy

All prompts must pass a security review prior to deployment. User inputs must be sanitized, sensitive instructions must be isolated, and guardrail prompts must be applied for every interaction. Runtime monitoring must detect and block malicious prompt activity.

Roles and Responsibilities

Application Security Lead curates prompt security standards and approves reviews. Engineering implements sanitization libraries and integrates guardrail services. Responsible AI defines behavioural boundaries and escalation criteria. Security Operations monitors alerts and coordinates incident response when violations occur.

Procedures

Prompt hardening must include:

  • Prompt design review documenting goals, constraints, and disallowed behaviours.
  • Input sanitation pipeline removing embedded instructions, HTML/Markdown exploits, and sensitive data patterns.
  • Isolation of system prompts and secrets in secure storage instead of embedding them in user-visible prompts.
  • Automated red teaming against prompt injection, jailbreaks, and context-hijacking scenarios.
  • Runtime guardrails that inspect inputs/outputs and enforce refusal or rollback when violations are detected.
  • Audit logging of prompt interactions for forensic review.

Exceptions

Prototype prompts may run with reduced guardrails inside sandbox environments only. Production rollout requires full control coverage.

Review Cadence

Prompt libraries undergo quarterly reviews to remove obsolete prompts, incorporate new intelligence, and verify guardrail effectiveness.

References

  • OWASP LLM Top 10 (Prompt Injection)
  • NIST AI RMF Govern/Manage functions
  • Internal documents: Prompt Hardening Guide, Guardrail Service Runbook, Secure Coding Standard

Ready to implement this policy?

Use VerifyWise to customize, deploy, and track compliance with this policy template.

Prompt Security and Prompt Hardening Policy | VerifyWise AI Governance Templates