The SEAL Model: Four Stages for Containing LLM Injection

Teams that fight prompt injection one bug at a time tend to lose. They patch the chatbot that leaked a system prompt, then get hit through a retrieved document, then through a tool output, each time treating the symptom as if it were the disease. The disease is structural: untrusted text and trusted instructions share the same channel, and the model cannot reliably tell them apart.

What helps is a shared model that tells you where each control belongs and why. This article introduces SEAL—Separate, Enforce, Audit, Limit—a four-stage framework for organizing prompt injection defense. It is not a product or a standard; it is a mental scaffold so your team stops arguing about individual fixes and starts reasoning about coverage.

Use SEAL to map your existing controls, find the gaps, and decide what to build next. If you want a flat list to run against a release, pair it with the Prompt Injection Defense Checklist for 2026.

Why a Framework Beats a Grab Bag of Tips

A list of tips answers "what could I do." A framework answers "what must I cover." The difference matters because injection attacks probe for the layer you forgot. Without a model, you tend to over-invest in the visible layer—the prompt—and under-invest in the layers attackers actually exploit, like tool permissions and output handling.

SEAL forces balance by naming four distinct jobs. A control either separates trust, enforces policy, audits behavior, or limits damage. If you cannot place a control in one of those stages, it is probably decoration.

The Four Stages

Stage 1: Separate

Separation is about keeping untrusted data out of the instruction channel. This is the foundation; skip it and the other three stages are damage control.

Position matters. Put system instructions and user-supplied data in distinct, clearly labeled regions of the context.
Mark data as inert. Tell the model that delimited content is material to analyze, never commands to obey.
Sanitize carriers. Strip invisible Unicode, HTML comments, and link tricks from retrieved content before it reaches the model.

Separation does not make the model trustworthy. It makes the boundary legible so the later stages have something to enforce.

Stage 2: Enforce

Enforcement turns intent into hard constraints the model cannot talk its way around.

Allowlist tools per state. Enumerate which tools are callable in which contexts; deny everything else by default.
Gate irreversible actions. Route money movement, deletions, and outbound messages through deterministic approval, not model judgment.
Constrain output shape. Require structured output with a fixed schema so downstream code, not prose, drives behavior.

Enforcement is where most teams find their biggest gap. A model often has far more reach than the feature needs.

Stage 3: Audit

Auditing assumes some attacks will get through and asks: will you see them?

Log prompts, completions, and tool calls for high-risk flows, with secrets redacted but context preserved.
Alert on anomalies—unusual tool sequences, output length spikes, repeated refusals—rather than only on errors.
Replay attacks regularly against a red-team suite so you measure whether your defenses are improving.

The metrics that make auditing actionable are covered in How to Measure Prompt Injection Defense: Metrics That Matter.

Stage 4: Limit

Limiting shrinks the blast radius of any successful injection so a breach becomes an inconvenience instead of a headline.

Least privilege credentials. Run tool calls with the requesting user's permissions, never the agent's superset.
Rate limits and budgets. Cap tool invocations and spend so a hijacked agent cannot run away.
Kill switches. Be able to disable a tool or agent without a deploy.

How the Stages Reinforce Each Other

The stages are not independent silos; each compensates for the limits of the others. Separation is strong at keeping casual instructions out of the command channel but cannot stop a model that has already been confused. Enforcement is strong at constraining actions but only enforces the policies you remember to write. Auditing catches what slips through enforcement but does nothing on its own to stop an attack in flight. Limit bounds the damage of everything the first three stages miss.

Read together, they form a deliberate sequence of fallbacks. An attack that defeats Separate runs into Enforce. One that defeats Enforce shows up in Audit. One that evades Audit is still capped by Limit. This layered redundancy is why no single stage needs to be perfect—a relief, because none of them can be. The failure of any one layer degrades safety gracefully rather than catastrophically, which is the entire point of organizing defense as a stack instead of a wall.

The most common architectural mistake is treating these as alternatives—choosing detection or containment rather than building both. SEAL exists partly to make that false choice visible. If your stack has strong Audit and weak Limit, you will know about every breach in detail and be unable to stop any of them from causing harm.

Applying SEAL in Practice

Mapping your current state

Take any model-facing feature and list every control you have. Sort each into Separate, Enforce, Audit, or Limit. Empty columns are your roadmap. Most teams find Separate and Enforce thin and Audit nearly absent.

Deciding what to build first

Sequence by leverage, not by ease. Separation is foundational, so it comes first. Then Limit, because it caps damage even when other stages fail. Enforce and Audit follow. This ordering means that even a half-finished implementation fails safe.

Knowing when each stage applies

Not every feature needs every control at full strength. A read-only summarizer needs strong Separation and light Limit. An agent that can move money needs all four at maximum. Let the worst realistic outcome set the intensity. For deeper treatment of these decisions, see Prompt Injection Defense: Trade-offs, Options, and How to Decide.

A worked example

Consider a customer-support agent that can read a knowledge base, look up an order, and issue a refund. Walk it through SEAL. Separate: the knowledge base and order data are untrusted, so they go in delimited, sanitized regions labeled as data. Enforce: the agent may call lookup freely but refunds route through a deterministic check on amount and eligibility that the model cannot bypass. Audit: every refund attempt and tool call is logged and alerts fire on unusual refund frequency. Limit: lookups run with the requesting customer's scope, refunds are capped per day, and a kill switch can disable the refund tool without a deploy.

Notice how the worst outcome—an unauthorized refund—gets three independent barriers across Enforce, Audit, and Limit, while the low-risk lookup gets light treatment. That asymmetry is SEAL working as intended: concentrate defense where the damage concentrates, and avoid burdening harmless capabilities with controls they do not need.

Frequently Asked Questions

How is SEAL different from a generic security model?

SEAL is specialized for the one property that makes language models unusual: they cannot reliably distinguish instructions from data. Generic models like defense-in-depth still apply, but they do not tell you that Separate must come first or that Limit deserves priority over Enforce. SEAL encodes injection-specific sequencing.

Do I need to implement all four stages before launching?

You need Separate and at least basic Limit before any feature that touches untrusted input goes live. Enforce and Audit can mature after launch for low-risk features, but high-stakes agents that take real actions should have all four stages working from day one.

Where do most teams have the biggest gap?

Enforce and Audit. Teams naturally invest in the prompt because it is visible and easy to edit. They under-invest in tool allowlisting and in logging, which are the controls that actually catch and contain real attacks. SEAL makes those gaps obvious by leaving columns empty.

Can SEAL handle indirect injection through documents?

Yes, and that is much of the point. The Separate stage explicitly covers sanitizing and delimiting retrieved content, which is the primary vector for indirect attacks. Audit then watches for the anomalous behavior those documents can trigger.

Key Takeaways

SEAL organizes prompt injection defense into four stages: Separate, Enforce, Audit, Limit.
Separation is foundational—keep untrusted data out of the instruction channel first.
Sequence implementation by leverage so a partial build still fails safe.
Map existing controls to the four stages to expose gaps; Enforce and Audit are usually thin.
Let the worst realistic outcome dictate how intensely each stage is implemented.

Use SEAL to map your existing controls, find the gaps, and decide what to build next. If you want a flat list to run against a release, pair it with the Prompt Injection Defense Checklist for 2026.

Why a Framework Beats a Grab Bag of Tips

The Four Stages

Stage 1: Separate

Separation is about keeping untrusted data out of the instruction channel. This is the foundation; skip it and the other three stages are damage control.

Position matters. Put system instructions and user-supplied data in distinct, clearly labeled regions of the context.
Mark data as inert. Tell the model that delimited content is material to analyze, never commands to obey.
Sanitize carriers. Strip invisible Unicode, HTML comments, and link tricks from retrieved content before it reaches the model.

Separation does not make the model trustworthy. It makes the boundary legible so the later stages have something to enforce.

Stage 2: Enforce

Enforcement turns intent into hard constraints the model cannot talk its way around.

Allowlist tools per state. Enumerate which tools are callable in which contexts; deny everything else by default.
Gate irreversible actions. Route money movement, deletions, and outbound messages through deterministic approval, not model judgment.
Constrain output shape. Require structured output with a fixed schema so downstream code, not prose, drives behavior.

Enforcement is where most teams find their biggest gap. A model often has far more reach than the feature needs.

Stage 3: Audit

Auditing assumes some attacks will get through and asks: will you see them?

Log prompts, completions, and tool calls for high-risk flows, with secrets redacted but context preserved.
Alert on anomalies—unusual tool sequences, output length spikes, repeated refusals—rather than only on errors.
Replay attacks regularly against a red-team suite so you measure whether your defenses are improving.

The metrics that make auditing actionable are covered in How to Measure Prompt Injection Defense: Metrics That Matter.

Stage 4: Limit

Limiting shrinks the blast radius of any successful injection so a breach becomes an inconvenience instead of a headline.

Least privilege credentials. Run tool calls with the requesting user's permissions, never the agent's superset.
Rate limits and budgets. Cap tool invocations and spend so a hijacked agent cannot run away.
Kill switches. Be able to disable a tool or agent without a deploy.

How the Stages Reinforce Each Other

Applying SEAL in Practice

Mapping your current state

Deciding what to build first

Knowing when each stage applies

A worked example

Frequently Asked Questions

How is SEAL different from a generic security model?

Do I need to implement all four stages before launching?

Where do most teams have the biggest gap?

Can SEAL handle indirect injection through documents?

Key Takeaways

SEAL organizes prompt injection defense into four stages: Separate, Enforce, Audit, Limit.
Separation is foundational—keep untrusted data out of the instruction channel first.
Sequence implementation by leverage so a partial build still fails safe.
Map existing controls to the four stages to expose gaps; Enforce and Audit are usually thin.
Let the worst realistic outcome dictate how intensely each stage is implemented.

The SEAL Model: Four Stages for Containing LLM Injection

Why a Framework Beats a Grab Bag of Tips

The Four Stages

Stage 1: Separate

Stage 2: Enforce

Stage 3: Audit

Stage 4: Limit

How the Stages Reinforce Each Other

Applying SEAL in Practice

Mapping your current state

Deciding what to build first

Knowing when each stage applies

A worked example

Frequently Asked Questions

How is SEAL different from a generic security model?

Do I need to implement all four stages before launching?

Where do most teams have the biggest gap?

Can SEAL handle indirect injection through documents?

Key Takeaways

Agency Script Editorial

Related Articles

Prompt Quality Decides Whether AI Earns Its Keep

Counting the Real Cost of Every Token You Send

Rolling Out AI Hallucinations Across a Team

Ready to certify your AI capability?

The SEAL Model: Four Stages for Containing LLM Injection

Why a Framework Beats a Grab Bag of Tips

The Four Stages

Stage 1: Separate

Stage 2: Enforce

Stage 3: Audit

Stage 4: Limit

How the Stages Reinforce Each Other

Applying SEAL in Practice

Mapping your current state

Deciding what to build first

Knowing when each stage applies

A worked example

Frequently Asked Questions

How is SEAL different from a generic security model?

Do I need to implement all four stages before launching?

Where do most teams have the biggest gap?

Can SEAL handle indirect injection through documents?

Key Takeaways

Agency Script Editorial

Related Articles

Prompt Quality Decides Whether AI Earns Its Keep

Counting the Real Cost of Every Token You Send

Rolling Out AI Hallucinations Across a Team

Ready to certify your AI capability?