Building a Contrastive Prompt One Boundary at a Time

Understanding why contrast works is one thing; sitting down and actually building a contrastive prompt that resolves a specific ambiguity is another. This article is the second kind: a do-this-then-that process you can follow today on a prompt that is currently giving you defensible-but-wrong outputs. Each step produces something concrete, and the steps are ordered so that earlier ones feed the later ones.

The process is deliberately sequential because the most common mistake is jumping straight to writing examples before identifying what the ambiguity actually is. If you do not know precisely which interpretation you are trying to rule out, your examples will vary in too many ways and teach the model the wrong lesson. We will start by pinning down the ambiguity, then build the contrast around it.

Work through this with a real prompt of your own open in front of you. The steps assume you have an instruction that produces output you do not want, and that the wrongness comes from the model interpreting your intent differently than you meant it.

Step One: Locate the Exact Ambiguity

You cannot resolve what you have not named.

Find the divergence

Run your current prompt several times and look at the outputs that are wrong. Then articulate, in one sentence, the specific interpretation the model is choosing that you did not intend. "It is treating professional as formal-and-long when I mean clear-and-brief" is a usable sentence. "It is bad" is not.

What this step produces

A one-sentence statement of the wrong interpretation.
A clear sense of which single dimension is being misread.
The target boundary you will teach in later steps.

Step Two: Collect Real Failure Examples

Your negatives should come from the model, not your imagination.

Harvest actual wrong outputs

From the runs in step one, save two or three outputs that exemplify the wrong interpretation. These are your candidate negatives, and because they are real, they target the model's genuine tendencies rather than imagined ones.

What to keep

Outputs that are plausible but clearly the wrong reading.
Variety within the wrong interpretation, so you can pick the cleanest example.
Notes on what specifically makes each one wrong.
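The harvesting step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: `harvest_failures`, `fake_generate`, and `note_metadata` are hypothetical names, the canned outputs are invented, and `fake_generate` stands in for whatever call you actually use to run your prompt.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Callable, List, Optional


@dataclass
class Failure:
    input_text: str
    output: str
    note: str  # what specifically makes this output the wrong reading


def harvest_failures(
    generate: Callable[[str], str],
    wrong_reading_note: Callable[[str], Optional[str]],
    input_text: str,
    runs: int = 5,
    keep: int = 3,
) -> List[Failure]:
    """Run the prompt several times; keep distinct wrong outputs with notes."""
    failures: List[Failure] = []
    seen = set()
    for _ in range(runs):
        output = generate(input_text)
        note = wrong_reading_note(output)
        if note is not None and output not in seen:
            seen.add(output)
            failures.append(Failure(input_text, output, note))
    return failures[:keep]


# Stand-in for a real model call: cycles through canned outputs (invented data).
_canned = cycle([
    "Dana R. (March 3) says the blender is loud but powerful.",
    "The blender is loud but powerful.",
    "On March 3, Dana R. reported a loud but powerful blender.",
])


def fake_generate(_input: str) -> str:
    return next(_canned)


def note_metadata(output: str) -> Optional[str]:
    # Crude, example-specific check for the wrong reading.
    if "Dana" in output or "March" in output:
        return "includes reviewer name or date"
    return None


candidates = harvest_failures(fake_generate, note_metadata, "review text here")
```

Keeping the note next to each output means that when you pick the cleanest negative later, you already know why it is wrong.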

Step Three: Write the Positive Example

Now define the target precisely.

Make the positive a near-twin of the negative

Write the output you actually want for the same input as one of your negatives. Keeping the input identical is what lets the model isolate the distinction. This mirrors the format-teaching logic in Teach a Model Your Format Without Writing Code, with a sharper boundary.

Quality checks

The positive and negative should share the same input.
They should differ only in the dimension you named in step one.
The positive should be unambiguously what you want, not a compromise.

Step Four: Isolate a Single Variable

This is the step that makes or breaks the contrast.

Strip out incidental differences

Compare your positive and negative. If they differ in more than the target dimension, edit them until the only meaningful difference is the one you are teaching. A pair that differs in length and tone and structure teaches a muddled lesson.

How to enforce isolation

List every way the two examples differ.
For each difference that is not your target, make the two match.
Confirm that what remains is exactly the distinction from step one.
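Listing the differences by eye is easy to get wrong, so a word-level diff can help. The sketch below uses Python's standard `difflib.SequenceMatcher`; the function name `list_differences` and the example pair are invented for illustration, and a word diff only surfaces surface differences, so judging tone or structure still needs a human read.

```python
import difflib
from typing import List, Tuple


def list_differences(negative: str, positive: str) -> List[Tuple[str, str, str]]:
    """Return (operation, negative_words, positive_words) for each non-matching span."""
    neg_words = negative.split()
    pos_words = positive.split()
    matcher = difflib.SequenceMatcher(a=neg_words, b=pos_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((tag, " ".join(neg_words[i1:i2]), " ".join(pos_words[j1:j2])))
    return diffs


# Invented pair: the only intended difference is the reviewer metadata.
negative = "Dana R. (March 3) says the blender is loud but powerful."
positive = "The blender is loud but powerful."

diffs = list_differences(negative, positive)
for operation, neg_span, pos_span in diffs:
    print(f"{operation}: {neg_span!r} -> {pos_span!r}")
```

If the list contains spans that have nothing to do with the dimension you named in step one, those are the incidental differences to edit away.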

Step Five: Assemble the Contrastive Prompt

Put the pieces together explicitly.

Structure for clarity

State the task, present the negative labeled as undesired, present the positive labeled as desired, and briefly name the reason. Then give the real input. Explicit labels prevent the model from guessing which example to emulate.

A reusable skeleton

Task instruction.
"Avoid outputs like this:" plus the negative.
"Produce outputs like this:" plus the positive.
A one-line reason for the distinction, then the actual input.
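The skeleton translates directly into a small helper. This is one possible rendering, assuming plain-text prompts; the function name `build_contrastive_prompt`, the "Reason:" and "Input:" labels, and the sample strings are illustrative choices rather than a required format.

```python
def build_contrastive_prompt(
    task: str,
    negative: str,
    positive: str,
    reason: str,
    actual_input: str,
) -> str:
    """Assemble task, labeled negative, labeled positive, reason, then the real input."""
    sections = [
        task.strip(),
        "Avoid outputs like this:\n" + negative.strip(),
        "Produce outputs like this:\n" + positive.strip(),
        "Reason: " + reason.strip(),
        "Input:\n" + actual_input.strip(),
    ]
    return "\n\n".join(sections)


# Invented example values for illustration only.
prompt = build_contrastive_prompt(
    task="Summarize the customer review in one sentence.",
    negative="Dana R. (March 3) says the blender is loud but powerful.",
    positive="The blender is loud but powerful.",
    reason="exclude reviewer identity and timing.",
    actual_input="Sam K., April 9: The strap broke within a week.",
)
print(prompt)
```

Keeping assembly in one function means the labels stay identical across every prompt you build, which makes later regression runs comparable.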

Step Six: Validate on Held-Out Inputs

Test where it counts: on inputs you did not use to build the prompt.

Confirm generalization

Run the assembled prompt on several fresh inputs near the boundary. Success is the model applying the distinction to inputs it has not seen, not just reproducing your examples. The validation mindset echoes Running a Complex Task Through One Sub-Prompt at a Time, where each step is checked before moving on.

What to watch for

Generalization: does it work on new inputs near the boundary?
Overcorrection: has the model swung too far toward avoiding the negative?
New ambiguity: did fixing this reading open a different one?
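The first two checks can be partly automated. The sketch below is an assumption-laden illustration: `validate_held_out` is a hypothetical helper, the fake model and its outputs are invented, and the two predicates are deliberately crude stand-ins for whatever test fits your own boundary. The third check, spotting a new ambiguity, still requires reading the outputs yourself.

```python
from typing import Callable, Dict, List


def validate_held_out(
    generate: Callable[[str], str],
    held_out: List[str],
    shows_wrong_reading: Callable[[str], bool],
    is_overcorrected: Callable[[str], bool],
) -> Dict[str, List[str]]:
    """Sort held-out inputs by whether the output passed, stayed wrong, or overcorrected."""
    report: Dict[str, List[str]] = {"passed": [], "still_wrong": [], "overcorrected": []}
    for text in held_out:
        output = generate(text)
        if shows_wrong_reading(output):
            report["still_wrong"].append(text)
        elif is_overcorrected(output):
            report["overcorrected"].append(text)
        else:
            report["passed"].append(text)
    return report


# Invented outputs standing in for real model responses to held-out reviews.
_fake_outputs = {
    "review A": "The kettle heats fast but the lid sticks.",
    "review B": "Sam K. says the strap broke within a week.",
    "review C": "Fine.",
}


def fake_generate(text: str) -> str:
    return _fake_outputs[text]


def shows_wrong_reading(output: str) -> bool:
    return "Sam K." in output  # reviewer name leaked into the summary


def is_overcorrected(output: str) -> bool:
    return len(output.split()) < 3  # substance lost along with the metadata


report = validate_held_out(
    fake_generate, list(_fake_outputs), shows_wrong_reading, is_overcorrected
)
```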

Step Seven: Iterate or Lock

Decide whether the contrast is done.

The exit criterion

If the prompt handles held-out inputs correctly without overcorrecting, lock it and document the pair. If not, return to the step that failed. A generalization miss usually means the negative was unrealistic (step two) or the pair differed in more than one dimension (step four). Overcorrection usually means the contrast was too extreme, so soften the pair.

Documentation to keep

The ambiguity statement from step one.
The final contrastive pair and the reason for it.
The held-out inputs used to validate, for future regression checks.
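One way to keep these three items together is a small record that serializes to JSON. This is a sketch under assumptions: the `ContrastRecord` name, its field names, and the sample values are all invented, and any storage format that preserves the same information would serve.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ContrastRecord:
    ambiguity: str   # the one-sentence statement from step one
    negative: str    # the undesired example in the final pair
    positive: str    # the desired example in the final pair
    reason: str      # the one-line reason for the distinction
    held_out_inputs: List[str] = field(default_factory=list)  # regression set


# Invented values for illustration only.
record = ContrastRecord(
    ambiguity="It treats metadata as part of the summary when I want substance only.",
    negative="Dana R. (March 3) says the blender is loud but powerful.",
    positive="The blender is loud but powerful.",
    reason="exclude reviewer identity and timing.",
    held_out_inputs=["review A", "review B", "review C"],
)

serialized = json.dumps(asdict(record), indent=2)
restored = ContrastRecord(**json.loads(serialized))
```

Storing the held-out inputs alongside the pair means a later regression check needs only this one record.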

A Full Pass on One Example

To make the sequence concrete, run it once end to end.

The case

A model summarizing customer reviews keeps including the reviewer's name and date, which you do not want. You want only the substance of the feedback.

The pass

Step one: the ambiguity is "it treats metadata as part of the summary when I want substance only."
Step two: collect three real summaries that include names and dates.
Step three: write the desired metadata-free summary for the same review.
Step four: ensure the only difference between desired and undesired is the metadata, not length or tone.
Step five: assemble the prompt with both labeled and the reason "exclude reviewer identity and timing."
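Step six: validate on reviews you did not use, confirming names and dates are dropped without losing substance.
Step seven: if the held-out reviews pass, lock the prompt and keep those reviews as the regression set.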

This single worked pass is the template for any ambiguity you face.
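For readers who want to see the artifacts, here is how the pass might look with concrete strings. Everything in it is invented for illustration: the reviewer, the dates, the review text, and the two summaries. The step-four check is specific to this example, where both summaries end in the same words and only the metadata prefix differs.

```python
# Steps two and three: a negative in the style of a real failure, and its near-twin positive.
negative = (
    "Dana R. wrote on 2024-03-03 that the blender is loud on high "
    "but fast at crushing ice."
)
positive = "The blender is loud on high but fast at crushing ice."

# Step four: confirm the substance is shared and only the metadata differs.
shared_substance = "blender is loud on high but fast at crushing ice."
pair_is_isolated = (
    negative.endswith(shared_substance)
    and positive.endswith(shared_substance)
    and "Dana R." not in positive
    and "2024-03-03" not in positive
)

# Step five: assemble the labeled prompt around a new review.
new_review = "Sam K., 2024-04-09: The strap broke within a week."
prompt = f"""Summarize the customer review in one sentence.

Avoid outputs like this:
{negative}

Produce outputs like this:
{positive}

Reason: exclude reviewer identity and timing.

Input:
{new_review}"""
```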

Handling Stubborn Ambiguities

Some ambiguities resist a single pair.

When one pair is not enough

Occasionally the model fixes the reported case but still stumbles on a distinct, related reading. The instinct is to pile on pairs, but each extra pair raises the risk of teaching conflicting lessons, so proceed carefully.

The disciplined extension

Confirm the second issue is genuinely distinct, not the same ambiguity reappearing.
Build a second, separately isolated pair for it, holding everything else constant.
Re-validate both distinctions together to ensure the pairs do not conflict, a caution echoed in Seven Ways Task Decomposition Quietly Sabotages Your Prompts.
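Re-validating both distinctions together can be sketched as running every check against every held-out output. The helper name `revalidate_together`, the second ambiguity ("adds a recommendation"), and the fake outputs are all hypothetical, chosen only to show the shape of the check.

```python
from typing import Callable, Dict, List


def revalidate_together(
    generate: Callable[[str], str],
    held_out: List[str],
    violations: Dict[str, Callable[[str], bool]],
) -> Dict[str, List[str]]:
    """For each named distinction, list the held-out inputs whose output violates it."""
    failures: Dict[str, List[str]] = {name: [] for name in violations}
    for text in held_out:
        output = generate(text)
        for name, violates in violations.items():
            if violates(output):
                failures[name].append(text)
    return failures


# Invented outputs standing in for the two-pair prompt's responses.
_fake_outputs = {
    "review A": "The kettle heats fast but the lid sticks.",
    "review B": "Great kettle, five stars, buy it today!",
}

violations = {
    "includes metadata": lambda out: "Sam K." in out,
    "adds a recommendation": lambda out: "buy it" in out.lower(),
}

joint = revalidate_together(_fake_outputs.__getitem__, list(_fake_outputs), violations)
```

An input that passed the first distinction before the second pair was added but fails it now is a sign that the pairs are conflicting.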

Frequently Asked Questions

What if I cannot articulate the ambiguity in one sentence?

Then you are not ready to write examples yet. Keep running the prompt and studying its wrong outputs until you can name the specific interpretation it is choosing. The sentence is the prerequisite, not an optional nicety.

Should the positive and negative use the same input?

Yes, whenever possible. Sharing the input is what lets the model isolate the one dimension that differs. Different inputs introduce noise that can teach the wrong distinction.

How many contrastive pairs should the final prompt have?

Often one well-isolated pair is enough. Add a second only if a distinct ambiguity remains after validation. More pairs increase the risk of teaching conflicting lessons.

My fix worked on the examples but not on new inputs. What went wrong?

The lesson did not generalize, usually because the negative was unrealistic or the pair differed in too many ways. Return to steps two and four, use a real failure, and isolate a single variable.

How do I tell overcorrection from a correct fix?

Overcorrection shows up as the model avoiding the negative so hard that the output becomes distorted in the opposite direction. If brief became uselessly terse, you overcorrected; soften the contrast.

Do I need to keep the held-out inputs after locking the prompt?

Yes. They become your regression set. If you later change the prompt or the model updates, re-running those inputs tells you whether the disambiguation still holds.

Key Takeaways

Name the exact wrong interpretation in one sentence before writing any examples.
Source negatives from the model's real failures, not from imagination.
Build the positive on the same input as the negative so the distinction is isolated.
Strip every incidental difference until only the target dimension varies.
Validate on held-out inputs to confirm the distinction generalizes without overcorrecting.
Keep the validation inputs as a regression set for future changes.

Agency Script Editorial
