Opinionated Rules for Refinement Loops That Converge

Agency Script Editorial

Editorial Team

September 1, 2020 · 8 min read

Generic advice about refinement loops is easy to find and useless in practice. "Iterate until it is good" tells you nothing about how to make the loop actually converge. The practices in this article are the opposite: specific, opinionated, and each paired with the reasoning that justifies it. They come from watching loops succeed and fail, and noticing what the successful ones consistently did.

You will not agree with every position here, and that is fine; opinionated practice invites argument. But each one earns its place by solving a real problem that vague advice ignores. Where a practice connects to a deeper treatment, the link is there, but the practices stand on their own.

If you want the failure-mode framing of these same ideas, Seven Ways Refinement Loops Quietly Go Off the Rails covers what goes wrong when you ignore them. This piece is about what to do.

Define the Standard Before You Touch the Model

The first and least negotiable practice: write down what good looks like before generating anything.

The reasoning

A loop converges toward a target. With no written target, every draft redefines the goal, and the loop circles forever. The standard is your finish line, and you cannot finish a race with no finish line.

How to do it well

  • Write three to five qualities you can check by pointing at the output.
  • Tie them to the specific task and reader, not to generic quality.
  • Keep them visible throughout the loop.
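
One way to keep the standard visible is to write it down as data. The sketch below is illustrative only: the `Criterion` class, the `evaluate` helper, and the three example checks are assumptions, not part of any tool. Many real qualities need a human to judge them, so the lambdas here stand in for checks you would often perform by eye.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    name: str                     # short label for the quality
    check: Callable[[str], bool]  # returns True when the draft passes


# Hypothetical standard for a short product-update email.
STANDARD = [
    Criterion("under 120 words", lambda d: len(d.split()) <= 120),
    Criterion("names the release date", lambda d: "March 3" in d),
    Criterion("ends with one clear ask", lambda d: d.rstrip().endswith("?")),
]


def evaluate(draft: str, standard: list[Criterion]) -> dict[str, bool]:
    """Return a pass/fail result for every item in the standard."""
    return {c.name: c.check(draft) for c in standard}


draft = "The update ships on March 3. Can you confirm your team is ready?"
results = evaluate(draft, STANDARD)
```

Each item is something you can point at in the output, which is what makes the later critique and stopping decisions concrete.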

The same discipline underpins audience work, where the standard is reader-fitness; see Tailoring Prompts to Readers: Direct Answers to Real Questions.

Treat the First Draft as a Probe

Do not invest in a perfect first prompt. Generate something to react to.

The reasoning

You cannot predict in advance everything that will make an output good. The draft reveals what you could not anticipate. Reacting to a concrete draft is faster and more accurate than predicting, so getting a draft quickly beats perfecting the prompt slowly.

How to do it well

  • Write a plain, simple first request.
  • Generate, then read the draft without judging the prompt itself.
  • Save your specificity for the critique, where it has a target.

This probe-first mindset is the on-ramp taught in Refining Model Output by Looping: A Plain Introduction.

Critique to a Location and a Direction

Every critique should name where the problem is and which way to fix it.

The reasoning

The model can only act on what you give it. "Make it better" gives nothing; "tighten the second paragraph to two sentences" gives a location and a direction. Specific critique produces specific revision, which is the only kind that converges.

How to do it well

  • Point at a sentence or section.
  • Name which standard item it misses.
  • State the direction the fix should go.
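
The three parts of a usable critique can be captured in a small structure. This is a minimal sketch; the `Critique` class and its field names are hypothetical, chosen only to mirror the three bullets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Critique:
    location: str       # where the problem is
    standard_item: str  # which item of the standard it misses
    direction: str      # which way the fix should go

    def to_instruction(self) -> str:
        """Render the critique as a revision instruction for the model."""
        return (
            f"In {self.location}, the draft misses '{self.standard_item}'. "
            f"{self.direction}"
        )


critique = Critique(
    location="the second paragraph",
    standard_item="concise",
    direction="Tighten it to two sentences.",
)
instruction = critique.to_instruction()
```

If you cannot fill in all three fields, the critique is probably not specific enough to send yet.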

Vague critique is one of the most common loop-killers, which is why precision here pays off so reliably.

Change Exactly One Thing Per Iteration

This is the practice people resist most and benefit from most.

The reasoning

Changing one thing keeps cause and effect legible. When the output improves, you know what helped; when it regresses, you can cleanly revert. Batching changes hides interactions and destroys your ability to learn from the loop.

How to do it well

  • Pick the single highest-value failure each iteration.
  • Instruct the change and explicitly preserve the rest.
  • Verify before moving to the next failure.
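
The one-change rule can be sketched as two small helpers: one that picks a single target, and one that asks for that change while preserving everything else. The function names, the importance scores, and the prompt wording are all assumptions for illustration.

```python
def pick_next_target(failures: dict[str, int]) -> str:
    """Pick the single highest-value failure (largest score wins)."""
    if not failures:
        raise ValueError("nothing is failing; the loop should stop")
    return max(failures, key=failures.get)


def build_revision_prompt(draft: str, change: str) -> str:
    """Ask for exactly one change and explicitly preserve the rest."""
    return (
        f"Revise the draft below. Make exactly one change: {change}\n"
        "Keep everything else exactly as it is.\n\n"
        f"DRAFT:\n{draft}"
    )


# Hypothetical failures, scored by how much each one matters to the reader.
failures = {"tone too formal": 2, "missing release date": 5, "too long": 3}
target = pick_next_target(failures)

draft = "We are pleased to announce the forthcoming update."
prompt = build_revision_prompt(draft, f"fix '{target}'")
```

Verification happens outside this snippet: re-check the revised draft against the standard before choosing the next target.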

It feels slower but converges faster, a trade the step-by-step procedure builds in deliberately.

Use the Model as a Critic, Not a Judge

Let the model critique, but never hand it the verdict.

The reasoning

Model self-critique surfaces real issues you might miss, which is valuable. But models rationalize their own drafts and approve work a reader would reject. If the model both writes and judges, the loop converges on the model's taste, not the reader's need.

How to do it well

  • Ask for critique to generate candidate issues.
  • Filter those candidates against your human-held standard.
  • Keep the final pass/fail decision yours.
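
A sketch of that division of labor, assuming the model's critique has already been parsed into a list of candidate issues. The data shape and the names `filter_candidates` and `verdict` are hypothetical; the point is that model output feeds the candidate list and nothing else.

```python
# Items in the human-held standard (hypothetical).
STANDARD_ITEMS = {"accurate", "concise", "actionable"}

# Hypothetical candidate issues, as if parsed from a model's self-critique.
candidates = [
    {"issue": "Paragraph 2 repeats the intro", "standard_item": "concise"},
    {"issue": "Could use a more exciting tone", "standard_item": "engaging"},
    {"issue": "No next step for the reader", "standard_item": "actionable"},
]


def filter_candidates(candidates: list[dict], standard_items: set[str]) -> list[dict]:
    """Keep only issues that map to an item in the human-held standard."""
    return [c for c in candidates if c["standard_item"] in standard_items]


def verdict(human_results: dict[str, bool]) -> bool:
    """Pass/fail comes from results a human recorded, never from the model."""
    return all(human_results.values())


kept = filter_candidates(candidates, STANDARD_ITEMS)
```

The "engaging" suggestion is dropped because the standard never asked for it, which is how the filter keeps the loop from drifting toward the model's taste.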

Steer, Never Re-Roll

When a draft disappoints, react to it rather than discarding it.

The reasoning

Re-rolling is random; it does not converge because it ignores what was wrong. Steering uses the specific gap to push the next draft in a known direction. A hundred re-rolls can leave you no closer; three steered revisions usually get you there.

How to do it well

  • Critique the disappointing draft instead of regenerating blindly.
  • Feed the specific gap back as a targeted instruction.
  • Reserve full regeneration for drafts too far off to react to.
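
The choice between steering and regenerating can be written as a simple rule. In this sketch the `next_action` function and the 0.8 threshold are assumptions; "too far off to react to" is a judgment call, and the ratio of failing items is only a rough stand-in for it.

```python
def next_action(results: dict[str, bool], regenerate_above: float = 0.8) -> str:
    """Decide what to do with a draft, given pass/fail per standard item."""
    failing = [name for name, ok in results.items() if not ok]
    if not failing:
        return "stop"
    # Assumed rule: regenerate only when nearly every item fails.
    if len(failing) / len(results) > regenerate_above:
        return "regenerate"
    return "steer"


mostly_fine = {"accurate": True, "concise": False, "actionable": True}
far_off = {"accurate": False, "concise": False, "actionable": False}

action_a = next_action(mostly_fine)
action_b = next_action(far_off)
```

The default path is "steer"; regeneration is the exception that has to be justified by how far off the draft is.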

Stop on the Standard, Not on Fatigue

End the loop deliberately, the moment the standard passes.

The reasoning

Past the standard, additional loops trade real time for gains nobody notices. The polish trap is real and expensive. A defined standard is also a defined stopping point, which is one more reason it has to come first.

How to do it well

  • Stop when every standard item passes.
  • Stop when revisions stop producing noticeable improvement.
  • Resist the urge to keep tweaking once you have arrived.
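
Both stopping conditions can be expressed as a check over the loop's history. This is a rough sketch: `should_stop` and its `patience` parameter are hypothetical, and counting passing items is only a crude proxy for "noticeable improvement".

```python
def should_stop(history: list[dict[str, bool]], patience: int = 2) -> bool:
    """history holds one pass/fail dict per revision, oldest first."""
    if not history:
        return False
    # Condition 1: every standard item passes on the latest draft.
    if all(history[-1].values()):
        return True
    # Condition 2: the last `patience` revisions did not raise the pass count.
    passes = [sum(result.values()) for result in history]
    if len(passes) > patience:
        return max(passes[-patience:]) <= passes[-patience - 1]
    return False


stalled = [
    {"accurate": True, "concise": False, "actionable": False},
    {"accurate": True, "concise": False, "actionable": False},
    {"accurate": True, "concise": False, "actionable": False},
]
improving = [
    {"accurate": True, "concise": False, "actionable": False},
    {"accurate": True, "concise": True, "actionable": False},
]
done = [{"accurate": True, "concise": True, "actionable": True}]
```

Here `should_stop(stalled)` and `should_stop(done)` both return `True`, while `should_stop(improving)` returns `False` because the loop is still making progress.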

Keep a Running Pass/Fail Picture

Across a multi-step loop, track state explicitly.

The reasoning

Working memory leaks over many iterations. Without an explicit record, you re-introduce solved problems and lose track of what still fails. A visible pass/fail picture keeps the loop moving forward instead of circling.

How to do it well

  • After each revision, update which standard items pass.
  • Note any regression the change introduced.
  • Use the picture to choose the next target and to recognize when you are done.
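
A running pass/fail picture can be as simple as a list of dictionaries, one per revision. The helper names and the sample log below are assumptions made for this sketch.

```python
def find_regressions(previous: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Items that passed on the previous revision but fail on the current one."""
    return [name for name, ok in current.items() if previous.get(name) and not ok]


def still_failing(current: dict[str, bool]) -> list[str]:
    """Items that do not yet pass; the next target comes from this list."""
    return [name for name, ok in current.items() if not ok]


# Hypothetical log: one entry per revision, oldest first.
log = [
    {"accurate": True, "concise": False, "actionable": False},
    {"accurate": True, "concise": True, "actionable": False},
    {"accurate": False, "concise": True, "actionable": True},
]

regressions = find_regressions(log[-2], log[-1])
remaining = still_failing(log[-1])
```

In this log the third revision fixed "actionable" but broke "accurate", which is exactly the kind of regression that goes unnoticed without a written record.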

Separate the Critic From the Author in Your Own Head

A subtler practice: when you critique your own loop, deliberately switch roles rather than staying in author mode.

The reasoning

The mindset that produces a draft is invested in it. That same mindset, asked to critique, tends to defend rather than scrutinize. Authors rationalize; critics interrogate. If you critique while still wearing the author hat, you will excuse problems a fresh reader would flag.

How to do it well

  • Before critiquing, consciously step out of the author role and read as the target reader would.
  • Critique against the standard, not against the effort you put into the draft.
  • Treat your own sunk cost as irrelevant to whether the output passes.

This is the human-side mirror of using the model as a critic but not a judge. Both are about keeping the evaluation honest by separating the thing that makes from the thing that judges.

Match the Loop's Intensity to the Stakes

Not every output deserves the same number of loops, and a good practitioner calibrates.

The reasoning

A throwaway internal note and a flagship published piece have different fitness bars. Running a flagship-grade loop on a throwaway wastes time; running a throwaway loop on a flagship ships something embarrassing. The practice is to set the standard's height to the stakes, then loop to that height.

How to do it well

  • For low-stakes work, write a short standard and accept the first draft that clears it.
  • For high-stakes work, write a fuller standard and expect more iterations.
  • Let the standard, not your mood or available time, decide how much polish is warranted.
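
Calibration can be made explicit with a small lookup. The tiers and numbers below are purely illustrative assumptions, not recommendations from the article; the expected iteration count is a planning estimate, and the standard still decides when the loop actually ends.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopPlan:
    standard_items: int       # how many checkable qualities to write down
    expected_iterations: int  # rough planning estimate, not a stopping rule


# Hypothetical tiers; adjust to your own work.
PLANS = {
    "low": LoopPlan(standard_items=3, expected_iterations=1),
    "high": LoopPlan(standard_items=5, expected_iterations=6),
}


def plan_for(stakes: str) -> LoopPlan:
    """Look up the loop plan for a stakes level ('low' or 'high')."""
    if stakes not in PLANS:
        raise ValueError(f"unknown stakes level: {stakes!r}")
    return PLANS[stakes]


memo_plan = plan_for("low")
flagship_plan = plan_for("high")
```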

Calibrating this way protects you from both under-investing in what matters and over-polishing what does not. It is the same stop-on-the-standard discipline applied at the level of how high to set the bar in the first place.

Frequently Asked Questions

If I only adopt one of these, which should it be?

Define the standard before you generate. It is the foundation the rest depend on: it makes critique specific, tells you when to stop, and gives you something to track. Adopting this one practice prevents the worst failures even if you ignore the others initially.

Is changing one thing at a time really worth the slowdown?

Yes, because the slowdown is an illusion. One change at a time converges faster overall because every step is legible and reversible. Batching changes feels fast but routinely sends loops circling when a regression appears and you cannot tell what caused it.

How do I balance the model's critique against my own judgment?

Use the model's critique to widen your list of candidate issues, then judge those candidates against your own standard. The model is good at spotting things you missed and bad at being impartial about its own work. Keep it as a source, not a judge.

What if my standard turns out to be wrong mid-loop?

Update it deliberately and note that you did. A standard can be refined as you learn, and that is healthy. The failure mode is letting it drift silently so every draft redefines the goal. Changing it on purpose is fine; letting it dissolve is not.

Do these practices work for non-writing tasks?

Yes. Any task where you can define checkable quality, react to a draft, and revise benefits from the same loop: standard first, probe, specific critique, one change, stop on the standard. Code, plans, and analyses all fit. The medium changes; the discipline does not.
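
The loop's shape can be sketched end to end without committing to any medium. Everything here is an assumption for illustration: `refine` is a hypothetical helper, `stub_model` is a deterministic stand-in for a real model call, and the automated check stands in for a judgment a human would normally make.

```python
from typing import Callable


def refine(
    generate: Callable[[str], str],
    request: str,
    standard: dict[str, Callable[[str], bool]],
    max_iterations: int = 5,
) -> str:
    """Standard first, probe, one targeted change per pass, stop on the standard."""
    draft = generate(request)  # the first draft is a probe
    for _ in range(max_iterations):
        failing = [name for name, check in standard.items() if not check(draft)]
        if not failing:
            break  # every standard item passes, so stop
        target = failing[0]  # change exactly one thing this iteration
        draft = generate(
            f"Revise this draft. Fix only: {target}. Keep everything else.\n\n{draft}"
        )
    return draft


def stub_model(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs on its own."""
    if "Fix only: mentions a date" in prompt:
        previous_draft = prompt.split("\n\n", 1)[1]
        return previous_draft + " It ships March 3."
    return "The update is ready."


standard = {"mentions a date": lambda d: "March" in d}
final = refine(stub_model, "Write a one-line update.", standard)
```

Swapping the string checks for test results, review checklists, or plan criteria changes the medium but leaves the loop untouched.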

How do I get a team to follow these consistently?

Standardize the standard format and the one-change rule, and make the running pass/fail picture a shared artifact. When the loop has the same shape for everyone, the practices become the default rather than something each person remembers individually. Consistency comes from shared structure, not willpower.

Key Takeaways

  • Define a concrete, checkable standard before generating; it is the finish line and the foundation.
  • Treat the first draft as a probe to react to, not a prompt to perfect.
  • Critique to a location and a direction, and change exactly one thing per iteration.
  • Use the model as a critic for candidate issues, but keep the verdict and the standard in human hands.
  • Steer instead of re-rolling, stop on the standard, and track a running pass/fail picture.
