
MCP guardrails

Configure PII detection, content filtering, and prompt injection rules for MCP tool inputs.

Overview

MCP guardrails scan tool inputs before the gateway forwards them to the backend MCP server. They catch PII, prohibited content and prompt injection attempts before they reach your tools.

You'll find them at AI Gateway > Agent Control > Guardrails.

These guardrails are separate from the AI Gateway's LLM guardrails. LLM guardrails scan chat completion requests to LLM providers, while MCP guardrails scan the tool invocation inputs agents send to MCP servers.

Rule list

The main view lists all configured guardrail rules. A summary line at the top shows the total rule count and how many are active (e.g., "3 rules configured, 2 active").

Each rule row displays:

Name: The name you gave the rule.
Rule type chip: Shows "PII", "Content filter" or "Prompt injection" with a color-coded badge.
Action chip: Shows "Block" or "Mask".
Scope: Where the rule applies (currently "tool_input").
Tool scope: Either "Applies to all tools" or a list of specific tool names.
Tool chips: When scoped to specific tools, small chips show each tool name.
Active toggle: Turn the rule on or off without deleting it.

Inactive rules appear dimmed (60% opacity). Click any row to edit it.

Rule types

Three types of guardrail rules are available:

Rule type	Chip color	What it does
PII detection	Blue (info)	Scans tool inputs for personally identifiable information: email addresses, phone numbers, credit card numbers, SSNs, etc. Uses Presidio-based detection.
Content filter	Amber (warning)	Checks tool inputs against a list of keywords or regex patterns that you define. Useful for blocking profanity, competitor names or domain-specific terms.
Prompt injection	Green (success)	Detects attempts to manipulate the tool's behavior through injected instructions in the input data.

Actions

Each rule has an action that determines what happens when a match is found:

Block: Rejects the entire tool call. The agent receives a JSON-RPC error with code -32003 and a message explaining the guardrail violation. The call is logged as "blocked" in the audit trail.
Mask: Replaces the matched content with placeholders before forwarding the call to the server. The tool still executes, but with sanitized input.
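
On the agent side, a blocked call can be told apart from other failures by its error code. The following is a minimal sketch, assuming the gateway returns a standard JSON-RPC 2.0 error object. The code -32003 comes from this page; the helper name is_guardrail_block and the message wording are hypothetical.

```python
import json

GUARDRAIL_BLOCKED = -32003  # error code documented for the Block action


def is_guardrail_block(raw_response: str) -> bool:
    """Return True if a JSON-RPC response carries the guardrail block error."""
    response = json.loads(raw_response)
    error = response.get("error")
    return isinstance(error, dict) and error.get("code") == GUARDRAIL_BLOCKED


# Hypothetical response bodies; the real message text may differ.
blocked = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "error": {"code": -32003, "message": "Guardrail violation: PII detected"},
})
ok = json.dumps({"jsonrpc": "2.0", "id": 2, "result": {"content": []}})

print(is_guardrail_block(blocked))  # True
print(is_guardrail_block(ok))       # False
```

An agent that recognizes this code can surface the violation to the user rather than retrying the same input.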

Masking affects tool results

When you mask, the tool receives modified input, so it may produce less relevant results. For example, masking an email address in a search query means the tool won't find results for that address.
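
To make the effect concrete, here is a small sketch of what a masked search query might look like. It uses a simplified regular expression as a stand-in for the gateway's detector, and the placeholder text <EMAIL_ADDRESS> is an assumption, not the documented format.

```python
import re

# Simplified email pattern for illustration only; not the gateway's detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask_emails(text: str, placeholder: str = "<EMAIL_ADDRESS>") -> str:
    """Replace every email-shaped substring with a placeholder."""
    return EMAIL.sub(placeholder, text)


query = "find orders for jane.doe@example.com"
masked = mask_emails(query)
print(masked)  # find orders for <EMAIL_ADDRESS>
```

The tool would receive the masked string, so a lookup keyed on the original address returns nothing.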

Creating a guardrail rule

Click Add guardrail in the top-right corner. The modal has these fields:

Field	Required	Description
Name	Yes	A descriptive name for the rule (e.g., "Block PII in database queries"). Max 255 characters.
Rule type	Yes	Select PII detection, Content filter or Prompt injection.
Action	Yes	Select Block or Mask.
Scope	Yes	Where to apply the rule. Currently only "Tool input" is available.
Applies to tools	No	Comma-separated tool names. Leave empty to apply to all MCP tools.
Config (JSON)	No	Optional JSON object for advanced rule settings. Format depends on the rule type.
Active	Yes	Toggle to enable or disable the rule immediately.
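
The field constraints in the table can be sketched as a small validation routine. This is an illustration only: the class name, attribute names and rule type identifiers are hypothetical and do not describe the gateway's actual schema. Only the 255-character limit, the Block and Mask actions and the "tool_input" scope come from this page.

```python
from dataclasses import dataclass, field

RULE_TYPES = {"pii", "content_filter", "prompt_injection"}  # hypothetical identifiers
ACTIONS = {"block", "mask"}
SCOPES = {"tool_input"}  # the only scope currently available


@dataclass
class GuardrailRule:
    name: str
    rule_type: str
    action: str
    scope: str = "tool_input"
    applies_to_tools: list = field(default_factory=list)  # empty means all tools
    config: dict = field(default_factory=dict)            # optional, type-specific
    active: bool = True


def validate(rule: GuardrailRule) -> list:
    """Return a list of human-readable problems; empty means the rule is valid."""
    errors = []
    if not rule.name or len(rule.name) > 255:
        errors.append("name must be 1-255 characters")
    if rule.rule_type not in RULE_TYPES:
        errors.append("unknown rule type")
    if rule.action not in ACTIONS:
        errors.append("action must be block or mask")
    if rule.scope not in SCOPES:
        errors.append("scope must be tool_input")
    return errors


print(validate(GuardrailRule("Block PII in database queries", "pii", "block")))  # []
```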

Scoping rules to specific tools

By default, a rule applies to all MCP tools. You can restrict it to specific tools by entering a comma-separated list of tool names in the "Applies to tools" field.

Examples:

Leave empty: the rule runs on every tool call.
Enter run_query, search_db: the rule only checks calls to those two tools.
Enter delete_record: the rule only checks calls to delete_record.
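
The matching behavior described by these examples can be sketched as follows. The function names are hypothetical, and trimming whitespace around each name is an assumption based on the "run_query, search_db" example.

```python
def parse_tool_scope(field_value: str) -> set:
    """Split the comma-separated field into a set of tool names."""
    return {name.strip() for name in field_value.split(",") if name.strip()}


def rule_applies(field_value: str, tool_name: str) -> bool:
    """An empty scope applies to every tool; otherwise the name must be listed."""
    tools = parse_tool_scope(field_value)
    return not tools or tool_name in tools


print(rule_applies("", "get_weather"))                  # True
print(rule_applies("run_query, search_db", "search_db"))  # True
print(rule_applies("delete_record", "run_query"))       # False
```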

Targeted PII rules

You don't need to scan every tool for PII. Scope your PII rules to tools that interact with databases or external services. Tools like get_weather or list_colors probably don't need PII scanning.

Config JSON

The Config field accepts a JSON object for advanced rule settings. The format depends on the rule type:

PII detection config

json

{
  "entities": {
    "EMAIL_ADDRESS": "mask",
    "PHONE_NUMBER": "block",
    "CREDIT_CARD": "block"
  }
}

Define which PII entity types to detect and how to handle each one. If no config is provided, the default detection settings are used.
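
One way to read a per-entity config like this is sketched below. The precedence shown (any "block" entity wins over "mask", and entity types missing from the config are ignored) is an assumption for illustration; this page does not specify how the gateway combines per-entity actions. The function name resolve_action is hypothetical.

```python
import json
from typing import Optional

config = json.loads("""
{
  "entities": {
    "EMAIL_ADDRESS": "mask",
    "PHONE_NUMBER": "block",
    "CREDIT_CARD": "block"
  }
}
""")


def resolve_action(detected: list, entities: dict) -> Optional[str]:
    """Pick one outcome for a call given the entity types found in its input."""
    actions = {entities[e] for e in detected if e in entities}
    if "block" in actions:  # assumed: block outranks mask
        return "block"
    if "mask" in actions:
        return "mask"
    return None  # nothing configured for the detected entities


entities = config["entities"]
print(resolve_action(["EMAIL_ADDRESS"], entities))                  # mask
print(resolve_action(["EMAIL_ADDRESS", "PHONE_NUMBER"], entities))  # block
```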

Content filter config

json

{
  "keywords": ["password", "secret", "confidential"],
  "patterns": ["\\b\\d{3}-\\d{2}-\\d{4}\\b"]
}

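Define keywords (exact match) and regex patterns to flag in tool inputs. Because the config is JSON, each backslash in a regex must be escaped, so \b is written as \\b. The pattern in this example matches strings shaped like US Social Security numbers (e.g., 123-45-6789).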
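
You can check a content filter config locally before saving it. The sketch below loads the example config and tests sample inputs with Python's re module. It assumes "exact match" means a case-sensitive whole-word match and that the gateway's regex dialect is compatible with Python's; neither is confirmed here. The function name find_matches is hypothetical.

```python
import json
import re

# Raw string so the JSON text keeps its doubled backslashes.
config = json.loads(r'''
{
  "keywords": ["password", "secret", "confidential"],
  "patterns": ["\\b\\d{3}-\\d{2}-\\d{4}\\b"]
}
''')


def find_matches(text: str, config: dict) -> list:
    """Return the keywords and patterns from the config that match the text."""
    hits = []
    for keyword in config.get("keywords", []):
        # Assumed semantics: case-sensitive, whole word.
        if re.search(rf"\b{re.escape(keyword)}\b", text):
            hits.append(keyword)
    for pattern in config.get("patterns", []):
        if re.search(pattern, text):
            hits.append(pattern)
    return hits


print(find_matches("reset my password", config))  # ['password']
print(find_matches("hello world", config))        # []
```

After JSON decoding, the pattern string contains single backslashes, which is the form the regex engine expects.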

Editing a rule

Click any rule row or the pencil icon to open the edit modal. All fields are editable. Click Save changes to apply. Changes take effect immediately for the next tool call.

Enabling and disabling rules

Each rule has an active toggle. Disabled rules are skipped during tool input scanning. This is useful for testing: enable a new rule, verify its behavior on the next tool call, and disable it again if it misfires, all without deleting it.

Deleting a rule

Click the trash icon on a rule row. A confirmation modal appears: "This action takes effect immediately. MCP tool invocations will no longer be checked against this rule."

Deletion is permanent, but you can re-create the rule at any time.

How guardrails execute

When an agent calls a tool through the gateway, the guardrail evaluation happens in this order:

1. The gateway authenticates the agent key and resolves the tool.
2. ACLs and rate limits are checked.
3. If the tool requires approval, the approval flow takes priority (guardrails run after approval).
4. All active guardrail rules that apply to this tool are evaluated against the input.
5. If any rule triggers a "block" action, the call is rejected immediately.
6. If any rule triggers a "mask" action, the matched content is replaced before forwarding.
7. The (possibly modified) input is forwarded to the backend MCP server.
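
The guardrail portion of this sequence (rule selection, block check, masking, forwarding) can be sketched as below. This is a simplified model, not the gateway's implementation: it treats the tool input as a plain string, and the Rule class, its fields and apply_guardrails are hypothetical names. Only the ordering and the -32003 code come from this page.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    action: str                 # "block" or "mask"
    active: bool
    tools: frozenset            # empty means all tools
    matches: Callable[[str], bool]
    mask: Optional[Callable[[str], str]] = None


class GuardrailBlocked(Exception):
    code = -32003  # JSON-RPC error code returned for blocked calls


def apply_guardrails(rules: list, tool_name: str, tool_input: str) -> str:
    # Keep only active rules scoped to this tool.
    applicable = [r for r in rules if r.active and (not r.tools or tool_name in r.tools)]
    triggered = [r for r in applicable if r.matches(tool_input)]
    # Any block rejects the call before masking is considered.
    for rule in triggered:
        if rule.action == "block":
            raise GuardrailBlocked(rule.name)
    # Apply masks, then hand the result back for forwarding.
    for rule in triggered:
        if rule.action == "mask" and rule.mask is not None:
            tool_input = rule.mask(tool_input)
    return tool_input


rules = [
    Rule("mask-email", "mask", True, frozenset(),
         lambda t: "a@b.com" in t, lambda t: t.replace("a@b.com", "<EMAIL>")),
    Rule("block-secret", "block", True, frozenset({"run_query"}),
         lambda t: "secret" in t),
    Rule("disabled", "block", False, frozenset(), lambda t: True),
]

print(apply_guardrails(rules, "search_db", "mail a@b.com"))  # mail <EMAIL>
```

Checking for blocks across all triggered rules first means a call is never partially masked and then rejected.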

Empty state

When no rules are configured, the page shows a few tips about what guardrails do:

Scan tool inputs before execution: Rules are evaluated against tool input data before the tool runs.
Scope rules to specific tools: Apply guardrails globally or restrict to specific MCP tools.
Multiple rule types: Choose from PII detection, content filtering or prompt injection detection.

Permissions

Creating, editing and deleting guardrail rules requires the Admin role. All authenticated users can view the rule list.

Activity

Blocked tool calls appear in the audit trail with "blocked" status.

MCP Tools

View which tools your guardrails apply to.

AI Gateway guardrails

The LLM-level guardrails that protect chat completion requests.
