Most people approach constraint-based prompting by trial and error: write a prompt, see what breaks, patch it, repeat. That works, slowly, but it produces no transferable knowledge. The next prompt starts from scratch. A framework changes that by giving you a repeatable sequence so the lessons from one prompt carry to the next.
Constraint-based output prompting is the practice of defining the exact shape and boundaries of a model's output rather than hoping usable structure emerges. The framework below, which we call Envelope, Boundaries, Priority, and Proof, organizes the decisions you have to make anyway into a deliberate order. Each stage answers a specific question, and skipping a stage tends to produce a specific failure.
Use it as scaffolding, not dogma. The order matters more than rigid adherence to any one step. The value of a named sequence is not that it is magic; it is that it externalizes decisions you would otherwise make implicitly and inconsistently. When the steps have names, you can tell a teammate which stage a prompt is failing at, you can review a prompt by checking each stage, and you can notice when you have skipped one. That shared vocabulary is most of the benefit.
One more reason to prefer a sequence over a checklist of tips: a sequence encodes dependencies. You cannot meaningfully define boundaries before you know the envelope, and you cannot prove anything before both exist. A flat list of best practices hides these dependencies and lets you tackle steps in an order that wastes effort, like writing elaborate test criteria for an output shape you later abandon. The framework's ordering is not arbitrary; each stage produces the input the next stage needs, which is what makes the whole thing efficient rather than merely thorough.
Stage One: Envelope
The question it answers
What is the literal shape of the output? Before anything else, decide the container: a JSON object with named keys, a fixed set of sections, a single value from a closed set.
How to apply it
Write a literal example of the output and put it in the prompt with "match this exactly." The envelope is the foundation; the later stages refine it. Treating output as an envelope with content inside, rather than content with structure sprinkled in, is the reframing that drove the result in What Tightening Output Rules Did for One Support Team.
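As a minimal sketch of this step, the literal example can live in one place and serve two jobs: it goes into the prompt, and it defines the shape check. The key names, example values, and the helper names `build_prompt` and `matches_envelope` are assumptions made for illustration, not part of any library.

```python
import json

# Hypothetical envelope: the literal example the prompt asks the model to match.
ENVELOPE_EXAMPLE = {
    "priority": "high",
    "summary": "Payment button unresponsive on mobile.",
}


def build_prompt(task: str) -> str:
    """Embed the literal example in the prompt with a 'match this exactly' instruction."""
    example = json.dumps(ENVELOPE_EXAMPLE, indent=2)
    return (
        f"{task}\n\n"
        "Return a JSON object. Match this exactly:\n"
        f"{example}"
    )


def matches_envelope(raw_output: str) -> bool:
    """True if the output parses as a JSON object with exactly the envelope's keys."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and set(parsed) == set(ENVELOPE_EXAMPLE)
```

Deriving the check from the same example that appears in the prompt keeps the two from drifting apart when the envelope changes.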
Stage Two: Boundaries
The question it answers
What must never appear, and what is the allowed range of each field?
How to apply it
State exclusions ("no preamble, no markdown fences") and enumerate closed sets ("priority is one of low, medium, high"). Boundaries are where you prevent the model's default helpfulness from corrupting machine-readable output. The examples in Concrete Scenarios Where Output Constraints Earn Their Keep lean heavily on this stage.
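The same boundaries can be asserted in code once they are written down. The sketch below assumes the output is meant to be a bare JSON object with a `priority` field; the function name `boundary_violations` and the violation messages are hypothetical.

```python
import json

# Closed set stated in the prompt: "priority is one of low, medium, high".
ALLOWED_PRIORITIES = ("low", "medium", "high")

# Three backticks, built this way so the example itself contains no literal fence.
FENCE = "`" * 3


def boundary_violations(raw_output: str) -> list[str]:
    """Return the boundary violations found; an empty list means the output passed."""
    text = raw_output.strip()
    if FENCE in text:
        return ["markdown fence present"]
    if not (text.startswith("{") and text.endswith("}")):
        return ["preamble or trailing text around the object"]
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    violations = []
    if parsed.get("priority") not in ALLOWED_PRIORITIES:
        violations.append(f"priority outside closed set: {parsed.get('priority')!r}")
    return violations
```

Returning named violations rather than a bare pass or fail makes it easier to see which exclusion or closed set the prompt is failing to enforce.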
When it matters most
Whenever output feeds another system. For purely human-facing output you can be lighter here.
Stage Three: Priority
The question it answers
When two constraints conflict, which one wins?
How to apply it
Make the trade-off explicit in the prompt: "Prefer accuracy over brevity." Without a stated priority, the model resolves conflicts unpredictably and your output drifts from run to run. This stage is the antidote to the conflicting-constraint failure in Seven Ways Output Constraints Quietly Break Your Prompts.
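One way to keep the priority rule explicit is to generate it from a ranked list, so the order is a deliberate decision rather than an accident of how the prompt was typed. This is a sketch only: `priority_clause` is a hypothetical helper, and the exact wording it emits is an assumption that may need retuning for a given model.

```python
def priority_clause(ranked_constraints: list[str]) -> str:
    """Render constraints as a numbered list in which earlier items win conflicts."""
    lines = ["If these constraints conflict, the lower-numbered one wins:"]
    lines += [
        f"{rank}. {text}"
        for rank, text in enumerate(ranked_constraints, start=1)
    ]
    return "\n".join(lines)


# Usage: accuracy outranks brevity, matching "Prefer accuracy over brevity."
clause = priority_clause(["Be accurate.", "Keep the answer under 50 words."])
```

Keeping the ranking in a list also makes a reordering show up as a one-line change in review.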
When it matters most
Whenever you have stacked several constraints, especially length plus completeness, which almost always collide.
Stage Four: Proof
The question it answers
How do you know the constraints actually hold?
How to apply it
Build a test set of messy real inputs, define pass criteria, and measure. A constraint you cannot assert against is a hope. The KPIs in Reading the Signal: What to Track When Outputs Must Conform give you the instruments for this stage.
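A small harness is enough to turn "measure" into a number. The sketch below assumes the model call can be wrapped as a function from input text to output text; `fake_model`, `is_bare_object`, and the sample inputs are stand-ins invented for illustration.

```python
from typing import Callable


def pass_rate(
    generate: Callable[[str], str],
    test_inputs: list[str],
    passes: Callable[[str], bool],
) -> float:
    """Fraction of test inputs whose output satisfies the written pass criterion."""
    if not test_inputs:
        raise ValueError("test set is empty")
    results = [passes(generate(text)) for text in test_inputs]
    return sum(results) / len(results)


# Stand-in for a real model call (hypothetical): it breaks format on blank input.
def fake_model(text: str) -> str:
    return '{"priority": "low"}' if text.strip() else "I need more detail."


# Written pass criterion: the output is a bare object with nothing around it.
def is_bare_object(output: str) -> bool:
    return output.startswith("{") and output.endswith("}")


# Messy inputs on purpose: blank and whitespace-only cases sit next to normal ones.
MESSY_INPUTS = ["app crashes on login", "   ", "URGENT!!! nothing works", ""]
rate = pass_rate(fake_model, MESSY_INPUTS, is_bare_object)
```

Because the test set and the criterion are fixed, the same call can be rerun after every prompt change and the rates compared directly.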
When it matters most
Always, but especially before any production deploy. Proof is the stage teams skip most and regret most.
Putting the Stages Together
Run them in order
Envelope defines the target, Boundaries protect it, Priority resolves its internal tensions, and Proof verifies it. Run them out of order and you end up tuning constraints you cannot measure, or measuring a shape you have not yet defined.
Re-enter stages as you learn
Proof often sends you back to Boundaries or Priority. That loop is the framework working, not failing. Each pass tightens the prompt and teaches you something reusable.
Walking the Framework Through a Real Task
A concrete pass
Suppose you need the model to triage a bug report into a structured ticket. The four stages play out as follows:
- Envelope: define a JSON object with the keys severity, component, and summary, and show a literal example.
- Boundaries: severity is one of low, medium, high, or critical; component comes from an allowlist; no preamble, output only the object.
- Priority: if accuracy and brevity conflict because the report is sprawling, prefer an accurate severity over a short summary.
- Proof: assemble fifty real reports, including vague and duplicate ones, define pass criteria, and measure.
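The pass above can be sketched in Python as a prompt assembled stage by stage, plus the pass criterion that Proof would run against each output. The component allowlist, the example ticket, and the name `ticket_passes` are hypothetical; a real allowlist would come from your own tracker.

```python
import json

SEVERITIES = ("low", "medium", "high", "critical")
# Hypothetical allowlist, for illustration only.
COMPONENTS = ("auth", "billing", "search")

EXAMPLE = {
    "severity": "high",
    "component": "billing",
    "summary": "Invoices show the wrong currency.",
}

PROMPT = "\n".join([
    "Triage the bug report below into a ticket.",
    # Envelope: the literal shape, shown as an example.
    "Output only a JSON object. Match this exactly:",
    json.dumps(EXAMPLE),
    # Boundaries: closed sets and exclusions.
    f"severity is one of: {', '.join(SEVERITIES)}.",
    f"component is one of: {', '.join(COMPONENTS)}.",
    "No preamble. No markdown fences.",
    # Priority: which constraint wins on a sprawling report.
    "If the report is sprawling, prefer an accurate severity over a short summary.",
])


def ticket_passes(raw_output: str) -> bool:
    """Pass criterion for Proof: exact keys, values inside the closed sets."""
    try:
        ticket = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(ticket, dict)
        and set(ticket) == set(EXAMPLE)
        and ticket["severity"] in SEVERITIES
        and ticket["component"] in COMPONENTS
        and isinstance(ticket["summary"], str)
    )
```

This check covers shape and closed sets only. Whether the chosen severity is actually right for a given report still needs labeled examples in the test set.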
Where each stage saves you
If you skipped Envelope, you would get inconsistent ticket shapes that no script could file. Skip Boundaries and you would see invented components and chatty preambles. Skip Priority and the severity would drift on long reports. Skip Proof and you would ship a prompt that worked on the three clean reports you happened to try and failed on the messy fourth. Naming the stages makes each of these omissions visible before it ships, which is the same discipline the pre-flight list enforces from a different angle.
Adapting the Framework to Your Context
Compress it for simple tasks
For a trivial prompt, the four stages collapse into a few minutes of thought: a one-line envelope, one exclusion, no real conflicts to prioritize, and a handful of test inputs. The framework should not feel heavy on small work. Its weight should scale with the stakes: light where failure is cheap, rigorous where failure is expensive. That is the calibration argued for in Choosing How Tight to Make Your Output Rules.
Expand it for agent chains
When constrained output feeds another model call, each stage gets heavier. The envelope becomes an inter-agent contract, the boundaries must anticipate adversarial upstream output, and proof must test the whole chain, not just one link. The framework still applies; it simply demands more at each step because the cost of a leak compounds across the chain.
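A minimal sketch of that idea, assuming a two-link chain in which a triage call feeds a routing call: the contract is checked between the links, so a malformed upstream output stops the chain instead of leaking downstream. `fake_triage` and `fake_route` are stand-ins for real model calls, and the contract shown (a JSON object with a valid severity) is an illustrative assumption.

```python
import json
from typing import Callable

SEVERITIES = ("low", "medium", "high", "critical")


def check_contract(raw: str) -> dict:
    """Validate one link's output against the inter-agent contract, or raise."""
    try:
        ticket = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"upstream output is not JSON: {raw!r}") from exc
    if not isinstance(ticket, dict) or ticket.get("severity") not in SEVERITIES:
        raise ValueError(f"upstream output violates the contract: {raw!r}")
    return ticket


def run_chain(
    report: str,
    triage: Callable[[str], str],
    route: Callable[[dict], str],
) -> str:
    """Run two links and refuse to pass unvalidated output downstream."""
    ticket = check_contract(triage(report))
    return route(ticket)


# Stand-ins for the two model calls (hypothetical).
def fake_triage(report: str) -> str:
    return '{"severity": "critical"}' if "crash" in report else "Looks fine to me!"


def fake_route(ticket: dict) -> str:
    return "page-oncall" if ticket["severity"] == "critical" else "backlog"
```

Testing `run_chain` end to end, rather than each stand-in alone, is what exercises the whole chain and not just one link.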
Make it a team artifact
The framework delivers the most value when a whole team shares it, because then the vocabulary travels. New prompts get reviewed stage by stage, regressions get diagnosed by stage, and onboarding a new prompt author means teaching four named ideas rather than a folklore of tricks. That shared language is what turns scattered prompt craft into a repeatable engineering practice.
Frequently Asked Questions
Why start with the Envelope instead of the content?
Because the container determines everything downstream. If you do not know the exact output shape, you cannot define boundaries, resolve conflicts, or write pass criteria. The envelope is the anchor.
What distinguishes Boundaries from Envelope?
Envelope is the positive shape (these keys, this structure). Boundaries are the negatives and ranges (no preamble, values only from this set). Together they specify the form of the output.
Do I always need the Priority stage?
Only when constraints can conflict. A single simple constraint needs no priority rule. But the moment you stack length, completeness, and tone, you almost certainly do.
How is Proof different from just looking at outputs?
Proof uses a fixed, messy test set and written pass criteria, so results are repeatable and comparable. Eyeballing a few clean outputs is exactly the habit that ships fragile prompts.
Can I skip stages for simple prompts?
You can compress them, but do not skip Proof. Even simple prompts benefit from a few adversarial test inputs. The other stages scale down naturally for trivial tasks.
Does this framework depend on a particular model?
No. The stages are model-agnostic. The specific phrasing within each stage may need retuning across models, but the decision sequence stays the same.
Key Takeaways
- The framework has four ordered stages: Envelope, Boundaries, Priority, Proof.
- Envelope defines the literal output shape with a shown example.
- Boundaries state exclusions and enumerate closed sets to protect the shape.
- Priority makes constraint conflicts explicit so output stops drifting.
- Proof verifies constraints against messy inputs with written pass criteria.
- Expect to loop back from Proof to earlier stages; that loop is the method working.