Pre-Flight Items to Run Before a Hypothesis Session

Agency Script Editorial

Editorial Team

January 2, 2021 · 7 min read
Tags: prompting for hypothesis generation, prompting for hypothesis generation checklist, prompting for hypothesis generation guide, prompt engineering

A checklist earns its place when it prevents the mistakes you reliably make under pressure. Hypothesis generation with AI has a handful of those: thin context, premature focus, missing the boring cause, no test path. This checklist is built to catch each one before it costs you a session.

Use it as a working tool rather than reading material. Keep it open while you run a session, and tick items as you go. Each entry includes a brief justification so you understand why it matters and can adapt it to your own context rather than following it blindly.

Before You Prompt

Preparation determines most of your output quality. Do not skip to prompting.

Setup Checklist

  • Write a one-paragraph problem statement. A precise statement is the foundation; everything downstream depends on it.
  • Include real numbers and dates. Specifics let the model tailor hypotheses to your situation instead of producing generic ideas.
  • List recent changes on your side. Launches, deploys, pricing moves, and campaigns are prime suspects and should be in front of the model.
  • Note what you have already ruled out. This stops the model from spending candidates on dead ends.
  • Define what a solved problem looks like. Knowing the goal keeps the session aimed at actionable hypotheses.
  • State what you would do with each answer. If a confirmed hypothesis would not change any decision, it may not be worth generating; this keeps the session tied to action.

This preparation mirrors the framing step in A Sequential Process for Drafting Testable Ideas With AI. Spending a few minutes assembling these items consistently beats a faster start, because every weakness in the setup compounds through the rest of the session.
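
The setup items above can be captured as a small structure that refuses to pretend it is complete. The following is a minimal sketch, not a prescribed tool: the `SessionSetup` class, its field names, and the digit check used as a stand-in for "real numbers and dates" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SessionSetup:
    """Hypothetical container for the setup checklist items."""
    problem: str
    recent_changes: list = field(default_factory=list)
    ruled_out: list = field(default_factory=list)
    solved_looks_like: str = ""
    decision_if_confirmed: str = ""


def missing_items(setup):
    """Return the names of setup items that are still empty."""
    missing = []
    if not setup.problem.strip():
        missing.append("problem statement")
    elif not any(ch.isdigit() for ch in setup.problem):
        # Crude proxy for "real numbers and dates": at least one digit.
        missing.append("numbers or dates in the problem statement")
    if not setup.recent_changes:
        missing.append("recent changes")
    if not setup.ruled_out:
        missing.append("what has been ruled out")
    if not setup.solved_looks_like.strip():
        missing.append("definition of solved")
    if not setup.decision_if_confirmed.strip():
        missing.append("what you would do with an answer")
    return missing


def render_context(setup):
    """Format the setup items as a context block to place ahead of a prompt."""
    def bullets(items):
        return "\n".join(f"- {item}" for item in items) or "- (none listed)"

    return (
        f"Problem: {setup.problem}\n"
        f"Recent changes on our side:\n{bullets(setup.recent_changes)}\n"
        f"Already ruled out:\n{bullets(setup.ruled_out)}\n"
        f"Solved looks like: {setup.solved_looks_like}\n"
        f"Decision this informs: {setup.decision_if_confirmed}"
    )
```

Calling `missing_items` before you prompt gives you the unticked boxes by name; `render_context` only formats what you supplied, so the quality of the problem statement is still on you.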

While Generating

The generation pass is about breadth and diversity, not judgment.

Generation Checklist

  • Ask for at least fifteen hypotheses. The non-obvious, useful ideas sit deep in the list, past the predictable first few.
  • Request explanations across named categories. Categories like measurement, behavior, technical, and external prevent the model from repeating one theme.
  • Explicitly invite uncomfortable hypotheses. Asking for bad-news explanations counters the tendency to overlook causes that implicate your own decisions.
  • Include a null hypothesis. Considering that the effect is noise or an artifact guards against chasing a pattern that is not real.
  • Do not evaluate yet. Judging during generation kills promising lines and biases toward the obvious.

The reasoning behind these items is laid out in Opinionated Habits That Make Hypothesis Prompts Pay Off.
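
One way to make the generation items hard to skip is to build the prompt from them. This is a sketch under stated assumptions: the function name `build_generation_prompt` and the exact wording of each instruction are illustrative, and the category names are the four mentioned in the checklist.

```python
MIN_HYPOTHESES = 15
DEFAULT_CATEGORIES = ("measurement", "behavior", "technical", "external")


def build_generation_prompt(context, count=MIN_HYPOTHESES,
                            categories=DEFAULT_CATEGORIES):
    """Assemble a breadth-first prompt from the generation checklist items."""
    # Never ask for fewer than the checklist minimum.
    count = max(count, MIN_HYPOTHESES)
    category_list = ", ".join(categories)
    return "\n".join([
        context,
        "",
        f"Generate at least {count} distinct hypotheses that could explain this.",
        f"Spread them across these categories: {category_list}.",
        "Include explanations that would be bad news for our own decisions.",
        "Include a null hypothesis: the effect is noise or a measurement artifact.",
        "Do not rank or evaluate the hypotheses yet; list them only.",
    ])
```

The result is plain text, so it works with whichever model or chat interface you already use.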

While Refining

Once you have a wide list, sharpen it into something testable.

Refinement Checklist

  • Rewrite each kept hypothesis with its mechanism. A causal chain, not a bare claim, is what makes a hypothesis testable.
  • Attach a test method to every hypothesis. If you cannot name a way to check it, it is not yet actionable.
  • Flag any hypothesis you cannot test. Either reframe it into something measurable or set it aside honestly.
  • Strip duplicates. Near-identical hypotheses inflate the list without adding options.
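
The refinement items translate naturally into a filter over the kept list. The sketch below assumes a hypothetical `Hypothesis` record with `mechanism` and `test_method` fields. Its duplicate check only catches claims that match after lowercasing and whitespace cleanup, so near-identical wording still needs a human pass.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    claim: str
    mechanism: str = ""    # the causal chain behind the claim
    test_method: str = ""  # how you would check it


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def refine(hypotheses):
    """Drop repeated claims, then split the rest into (actionable, flagged)."""
    seen = set()
    actionable, flagged = [], []
    for h in hypotheses:
        key = normalize(h.claim)
        if key in seen:
            continue  # duplicate of an earlier claim
        seen.add(key)
        if h.mechanism.strip() and h.test_method.strip():
            actionable.append(h)
        else:
            # Missing a mechanism or a test: reframe it or set it aside.
            flagged.append(h)
    return actionable, flagged
```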

While Prioritizing

You cannot test everything. Prioritization is where your judgment leads.

Prioritization Checklist

  • Score each hypothesis on impact if true. High-impact hypotheses deserve attention even if they are less likely.
  • Score each on cost to test. Cheap, fast checks let you learn quickly and eliminate options.
  • Test cheap, decisive hypotheses first. Removing candidates for almost no cost narrows the field efficiently.
  • Pick three to start. A short list keeps the investigation focused; you can always return to the rest.

This scoring approach is detailed in Weighing the Competing Ways to Prompt for Hypotheses.
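
A worked example can make the scoring concrete. The sketch below ranks by impact divided by cost, with cheaper tests winning ties. That ratio is an assumption chosen for illustration, not a formula the checklist prescribes, and the 1 to 5 scale and candidate names are invented.

```python
def prioritize(scored, top_n=3):
    """Rank (name, impact, cost) tuples and return the first few names.

    Both scores are assumed to be on a 1-5 scale, with cost at least 1.
    Higher impact per unit of cost ranks first; lower cost breaks ties.
    """
    ranked = sorted(scored, key=lambda item: (-item[1] / item[2], item[2]))
    return [name for name, _impact, _cost in ranked[:top_n]]


candidates = [
    ("tracking broke", 4, 1),
    ("pricing change", 5, 3),
    ("seasonality", 2, 1),
    ("competitor launch", 3, 4),
    ("onboarding bug", 5, 2),
]
shortlist = prioritize(candidates)
# shortlist == ["tracking broke", "onboarding bug", "seasonality"]
```

A score like this is a starting order, not a verdict. If a high-impact hypothesis lands outside the top three, your judgment can still pull it forward.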

After the Session

The work does not end when you have a shortlist. Capture it.

Closeout Checklist

  • Log every hypothesis and its status. A record prevents you from regenerating ideas you already resolved.
  • Note the evidence that moved each one. Preserving reasoning turns scattered sessions into institutional memory.
  • Schedule the first test. A hypothesis with no test date tends to drift; commit to a check.
  • Plan to regenerate after results. New evidence reshapes the hypothesis space, so the next session starts from what you learned.
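
A log does not need special tooling. As a minimal sketch, assuming a plain list of dictionaries with hypothetical field names, the closeout items could look like this, with JSON Lines as one possible format for saving the result:

```python
import json


def add_entry(log, claim, status="open", evidence="", first_test=None):
    """Append one hypothesis record to the in-memory log and return it."""
    entry = {
        "claim": claim,
        "status": status,          # e.g. "open", "confirmed", "ruled out"
        "evidence": evidence,      # what moved the status
        "first_test": first_test,  # date string, or None if unscheduled
    }
    log.append(entry)
    return entry


def unscheduled(log):
    """Return open claims that still have no test date."""
    return [e["claim"] for e in log
            if e["status"] == "open" and not e["first_test"]]


def to_jsonl(log):
    """Serialize the log as JSON Lines, one record per line."""
    return "\n".join(json.dumps(e) for e in log)
```

Running `unscheduled` at the end of a session shows which hypotheses are likely to drift because nobody committed to a check.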

Adapting the Checklist to Your Context

A checklist is only useful if it fits the work in front of you. The version above is the full, high-stakes form, and you should expect to trim it for everyday use rather than treat every item as mandatory.

Scaling by Stakes

The right amount of rigor scales with how costly a wrong conclusion would be. Use these rough tiers as a guide:

  • Quick, low-stakes questions: Run the setup and generation items only. A precise problem statement and a breadth prompt with a null hypothesis are usually enough. Skip formal scoring and logging.
  • Recurring operational problems: Add the refinement and prioritization items so you produce testable, ranked hypotheses. Keep a lightweight log because the same problems tend to recur.
  • High-stakes investigations: Run every item deliberately and document each stage. When a wrong conclusion is expensive, the few minutes each item costs are cheap insurance.

The skill is not memorizing the list; it is knowing which items to keep when time is short. Over a few sessions you will develop an instinct for which checks catch your particular mistakes most often, and you can promote those to non-negotiable.
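
If you keep the checklist in a shared script or template, the tiers above can be written down as a simple lookup. The tier keys and depth labels here are assumptions made for illustration; adjust them to match how your team names things.

```python
CHECKLIST_STAGES = ["setup", "generation", "refinement", "prioritization", "closeout"]

PLAN_BY_STAKES = {
    "low": {"setup": "full", "generation": "full"},
    "recurring": {
        "setup": "full",
        "generation": "full",
        "refinement": "full",
        "prioritization": "full",
        "closeout": "light log only",
    },
    "high": {stage: "full, documented" for stage in CHECKLIST_STAGES},
}


def plan_for(stakes):
    """Return (stage, depth) pairs for a stakes tier, in checklist order."""
    plan = PLAN_BY_STAKES[stakes]
    return [(stage, plan[stage]) for stage in CHECKLIST_STAGES if stage in plan]
```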

Turning the Checklist Into Team Habit

When more than one person runs hypothesis sessions, an informal checklist drifts. Different people skip different items, and session quality becomes uneven. Codifying the list, even as a shared document or a prompt template, makes the standard explicit. A team that agrees on the same setup and generation items produces comparable hypotheses and avoids the situation where one person's session is rigorous and another's is a single vague prompt. This shared standard is what makes the case-study-style turnaround in How a Stalled Trial Funnel Got Diagnosed by AI Prompts repeatable rather than lucky.

Common Ways the Checklist Gets Misused

A checklist can fail even when followed, usually because it is treated as a box-ticking ritual rather than a thinking aid. Watch for a few patterns that drain its value.

Pitfalls to Avoid

  • Ticking without engaging. Marking an item done because you technically did it, while producing a vague problem statement, defeats the purpose. The items are prompts to think, not formalities.
  • Treating every item as mandatory. Forcing the full high-stakes list onto a trivial question wastes time and breeds resentment for the checklist itself. Scale it to the stakes.
  • Never updating it. Your most common mistakes are personal. If you keep skipping the null hypothesis, promote it to a bold, non-negotiable item. A static checklist that ignores your actual failure pattern is less useful than one you tune.
  • Using it only at the start. The closeout items, logging and scheduling the first test, are where many sessions quietly fail. Run the checklist through to the end, not just the setup.

The goal is a living tool that catches your real mistakes, not a compliance document. When it stops catching anything, revise it. The mistakes it is meant to prevent are catalogued in Seven Ways Hypothesis Prompts Quietly Go Wrong.

Frequently Asked Questions

Do I need to use every item every time?

No. For a quick, low-stakes question you might use only the setup and generation items. The full checklist is for problems that matter enough to investigate carefully. Adapt the depth to the stakes.

Why is logging hypotheses on the checklist?

Because without a log, teams regenerate and re-debate the same ideas across sessions, wasting effort and losing the reasoning behind past decisions. A simple log turns isolated sessions into a growing, searchable knowledge base.

What is the single most skipped item?

Including a null hypothesis. People are eager to explain a surprising result and forget to ask whether the result is even real. Skipping it leads to investigations built on noise or measurement artifacts.

How is this different from a generic brainstorming checklist?

It is built around the specific failure modes of AI-assisted hypothesis work: model overconfidence, repetition, bias toward obvious causes, and untestable ideas. Generic brainstorming checklists do not address those, because they assume a human source of ideas.

Can I turn this into a prompt template?

Yes, and many people do. You can encode the setup, generation, and refinement items into a reusable prompt structure. Just keep prioritization and logging as human steps, since those depend on your judgment and your records.

Key Takeaways

  • Preparation, especially a specific problem statement with real numbers, drives most of the output quality.
  • During generation, aim for breadth, force categories, invite uncomfortable ideas, and always include a null hypothesis.
  • Refine by attaching a mechanism and a test method to every kept hypothesis.
  • Prioritize by impact and cost to test, then start with three cheap, decisive checks.
  • Close every session by logging hypotheses, recording evidence, and scheduling the first test.
