ANALYSIS March 17, 2026 10 min read

OpenAI Agents Guardrails: What They Cover, What They Don't, and When You Still Need Tool Review

Developers are discovering that guardrails do not behave as a hard stop. Teams assume a tripped guardrail means no tools will run, but at runtime tool work can start before the exception is raised.

The Guardrail Model: Three Layers, Different Boundaries

OpenAI's Agents SDK provides three types of guardrails, each with specific coverage boundaries that aren't immediately obvious from the API surface:

Input guardrails validate user input before (or during) agent execution. They're agent-level configurations that only trigger if that agent is the first in a chain. The key decision here is execution mode: parallel (default) or blocking.

Output guardrails validate the final agent output. They only run if the agent produces the final response in a chain — intermediate outputs from delegated agents skip output guardrails entirely.

Tool guardrails wrap individual function calls, running validation before and/or after execution. These are the most granular but also have the narrowest coverage.

Understanding these boundaries is essential because gaps between them are where security issues emerge in production.
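The first-agent and final-agent rules above can be sketched in a few lines. This is a plain-Python model, not the SDK: AgentSpec, guardrails_that_run, and the guardrail names are hypothetical and exist only to make the boundary visible.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSpec:
    """Hypothetical stand-in for an agent; not the SDK's Agent class."""
    name: str
    input_guardrails: list[str] = field(default_factory=list)
    output_guardrails: list[str] = field(default_factory=list)


def guardrails_that_run(chain: list[AgentSpec]) -> dict[str, list[str]]:
    """Return the guardrails that fire for a run passing through `chain` in order.

    Encodes the two rules from the text: input guardrails fire only for the
    first agent, output guardrails only for the agent that produces the
    final response.
    """
    if not chain:
        return {"input": [], "output": []}
    return {
        "input": list(chain[0].input_guardrails),
        "output": list(chain[-1].output_guardrails),
    }


triage = AgentSpec(
    "triage",
    input_guardrails=["block_pii"],
    output_guardrails=["tone_check"],
)
billing = AgentSpec(
    "billing",
    input_guardrails=["billing_scope"],
    output_guardrails=["no_refund_promises"],
)

# triage hands off to billing, which produces the final response.
coverage = guardrails_that_run([triage, billing])
print(coverage)
```

In this model the billing agent's input guardrail and the triage agent's output guardrail never run, even though both are configured.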

The Parallel vs Blocking Problem

By default, input guardrails run in parallel with agent execution (run_in_parallel=True). This minimizes latency — both the guardrail and the agent start simultaneously. But this creates a race condition that has confused many developers.

If the guardrail triggers after the agent has already started, the agent may have consumed tokens and even initiated tool calls before the InputGuardrailTripwireTriggered exception halts execution. This isn't a bug — it's documented behaviour — but it's counterintuitive if you expect guardrails to act as a hard gate.

GitHub issue #889 describes exactly this: FileSearchTool running despite a triggered input guardrail. Issue #991 generalizes it: tool execution can continue after the exception is raised because the guardrail and agent are racing.

The solution is the blocking execution mode (run_in_parallel=False), which ensures the guardrail completes before the agent starts. This prevents any token consumption or tool execution if the guardrail triggers. The tradeoff is latency — you wait for the guardrail before any agent work begins.
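The difference between the two modes can be reproduced with a small asyncio simulation. It does not call the SDK: TripwireTriggered, guardrail, agent, and the sleep durations are all assumptions chosen so the simulated tool call lands before the simulated guardrail finishes.

```python
import asyncio


class TripwireTriggered(Exception):
    """Hypothetical stand-in for InputGuardrailTripwireTriggered."""


async def guardrail(delay: float) -> None:
    await asyncio.sleep(delay)  # simulated classification latency
    raise TripwireTriggered("input rejected")


async def agent(tool_log: list[str]) -> str:
    await asyncio.sleep(0.01)  # simulated model turn that decides to call a tool
    tool_log.append("file_search")  # side effect: the tool call has started
    await asyncio.sleep(0.5)  # simulated remainder of the run
    return "final answer"


async def run(parallel: bool, tool_log: list[str]) -> str:
    if not parallel:
        await guardrail(0.1)  # blocking: raises before the agent is started
        return await agent(tool_log)
    agent_task = asyncio.create_task(agent(tool_log))  # parallel: both start now
    try:
        await guardrail(0.1)
    except TripwireTriggered:
        agent_task.cancel()  # halts the agent, but cannot undo earlier side effects
        await asyncio.gather(agent_task, return_exceptions=True)
        raise
    return await agent_task


def demo(parallel: bool) -> list[str]:
    tool_log: list[str] = []
    try:
        asyncio.run(run(parallel, tool_log))
    except TripwireTriggered:
        pass  # the tripwire fires in both modes; only the side effects differ
    return tool_log


parallel_log = demo(parallel=True)
blocking_log = demo(parallel=False)
print("parallel:", parallel_log)
print("blocking:", blocking_log)
```

Both runs end with the tripwire exception. The only difference is whether the simulated tool call was recorded first, which is the behaviour the issues above describe.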

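When to use each mode: keep the parallel default when latency matters most and some token spend or an early tool call before the tripwire is acceptable. Switch to blocking when a tripped guardrail must guarantee that no tokens are consumed and no tools run.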

What Tool Guardrails Actually Cover

Tool guardrails are configured on individual tools using the @function_tool decorator. They run on every invocation of that tool, making them useful for enforcing invariants at the call site.
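As a rough illustration of what running a check on every invocation means, here is a plain-Python decorator that validates arguments before a call and the result after it. This is not the SDK's API: guarded_tool, ToolCallRejected, and both check functions are hypothetical names, and the file contents are stubbed.

```python
from functools import wraps
from typing import Callable, Optional


class ToolCallRejected(Exception):
    """Hypothetical error raised when a check refuses a tool call."""


def guarded_tool(
    before: Optional[Callable[[dict], Optional[str]]] = None,
    after: Optional[Callable[[object], Optional[str]]] = None,
):
    """Wrap a plain function so the checks run on every invocation.

    Each check returns None to allow the call, or a reason string to reject it.
    """
    def decorate(fn):
        @wraps(fn)
        def wrapper(**kwargs):
            if before and (reason := before(kwargs)):
                raise ToolCallRejected(f"{fn.__name__} input: {reason}")
            result = fn(**kwargs)
            if after and (reason := after(result)):
                raise ToolCallRejected(f"{fn.__name__} output: {reason}")
            return result
        return wrapper
    return decorate


def no_path_traversal(args: dict) -> Optional[str]:
    return "path escapes the workspace" if ".." in args.get("path", "") else None


def no_secrets(result: object) -> Optional[str]:
    return "output contains an API key marker" if "sk-" in str(result) else None


@guarded_tool(before=no_path_traversal, after=no_secrets)
def read_file(path: str) -> str:
    # Stubbed file contents so the sketch runs without touching disk.
    return {"notes.txt": "meeting at 10", "env.txt": "KEY=sk-test"}.get(path, "")


print(read_file(path="notes.txt"))
for path in ("../etc/passwd", "env.txt"):
    try:
        read_file(path=path)
    except ToolCallRejected as exc:
        print("rejected:", exc)
```

The checks only ever see calls that go through the wrapped function. Anything invoked by another route is invisible to them, which is the limitation the rest of this section turns on.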

But their coverage has significant limitations:

Covered: Custom function tools created with @function_tool

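Not covered: hosted tools such as FileSearchTool, handoffs between agents, and agents exposed as tools to other agents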

This means that if your agent uses FileSearchTool to retrieve documents, those calls bypass tool guardrails entirely. If you rely on handoffs to delegate work, the handoff calls are not checked by tool guardrails. If you expose an agent as a tool to another agent, that interface has no tool guardrail integration either.

These aren't edge cases — they're common patterns in multi-agent systems. And they're exactly where malicious or accidental harmful behaviour can slip through.
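One way to make these gaps visible in review is to inventory every callable surface an agent exposes and flag the ones tool guardrails cannot reach. The sketch below is a hypothetical audit helper, not part of the SDK: Surface, the kind labels, and the example names are assumptions drawn from the categories described above.

```python
from dataclasses import dataclass

# Kinds of callable surface, following the coverage described in the text.
COVERED_KINDS = {"function_tool"}
UNCOVERED_KINDS = {"hosted_tool", "handoff", "agent_as_tool"}


@dataclass(frozen=True)
class Surface:
    """Hypothetical record of one thing an agent can invoke."""
    name: str
    kind: str
    has_tool_guardrail: bool = False


def unguarded(surfaces: list[Surface]) -> list[str]:
    """Describe every surface that tool guardrails do not protect."""
    gaps = []
    for s in surfaces:
        if s.kind in UNCOVERED_KINDS:
            gaps.append(f"{s.name} ({s.kind}: tool guardrails do not apply)")
        elif s.kind in COVERED_KINDS and not s.has_tool_guardrail:
            gaps.append(f"{s.name} (function tool with no guardrail attached)")
    return gaps


inventory = [
    Surface("read_file", "function_tool", has_tool_guardrail=True),
    Surface("send_email", "function_tool"),
    Surface("file_search", "hosted_tool"),
    Surface("to_billing", "handoff"),
    Surface("research_agent", "agent_as_tool"),
]

gaps = unguarded(inventory)
for gap in gaps:
    print(gap)
```

With this inventory only read_file passes. The other four entries are reported, three of them because no tool guardrail can be attached to that kind of surface at all.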

The Coverage Gap Map

Here's what the guardrail system handles and where it leaves gaps:

Scenario | Input Guardrail | Output Guardrail | Tool Guardrail
Direct user input to first agent | ✅ (if blocking) | N/A | N/A
User input after handoff | ❌ | N/A | N/A
Custom function tool call | N/A | N/A | ✅
Hosted tool call (FileSearch, etc.) | N/A | N/A | ❌
Handoff to another agent | N/A | N/A | ❌
Final agent output | N/A | ✅ | N/A
Intermediate agent output | N/A | ❌ | N/A

When You Still Need Tool Review

Guardrails operate at the orchestration layer — they validate inputs, outputs, and function call boundaries. They do not inspect the internal implementation of tools, the MCP servers they connect to, or the tool descriptions that guide agent behaviour.

This is where SkillShield fits. We scan:

Guardrails and SkillShield operate at different layers. Guardrails validate the flow of execution. SkillShield validates what gets executed. Production agent security needs both.

Close the Guardrail Gaps with SkillShield

Guardrails control execution flow. SkillShield inspects what executes. Together they provide defense in depth for production AI agents.
