Teams rarely fail at prompt libraries because they cannot write good prompts. They fail because they have no model for how a prompt moves from a clever one-off to a dependable, shared asset. Without that model, every prompt lives in a different stage of maturity and nobody can tell which ones are safe to depend on.
This article introduces a named structure called CRAFT: Capture, Refine, Annotate, File, and Track. It is deliberately small. Five stages are enough to describe the full life of a reusable prompt without turning the library into a process-heavy burden. Each stage answers a specific question, and each has a clear signal that tells you when to move on.
The point of a named model is not ceremony. It is shared language. When someone says a prompt is "captured but not refined," everyone knows exactly what that means and what is missing. The rest of this article walks through each stage and, just as importantly, tells you which stage to focus on given where your team is starting from.
Stage 1: Capture
Capture is about not losing good prompts. Most valuable prompts are written in the middle of solving a real problem and discarded the moment the work is done.
What capture looks like
The bar is intentionally low: a prompt and a one-line note on what it was for, saved somewhere central within the same work session. No formatting, no review, no perfection. The only sin at this stage is loss.
When to move on
Move past capture when you have a steady inflow of raw prompts and your problem has shifted from "we lose good prompts" to "we have a pile of good prompts nobody can use." If capture is your bottleneck, fix it before anything else, because the later stages have nothing to operate on without it.
Stage 2: Refine
Refine turns a working-for-me prompt into a working-for-anyone prompt. This is where most quality is won or lost.
What refine looks like
You parameterize the prompt, replacing hardcoded specifics with named variables. You strip out anything incidental to the original task. You test it against a few inputs beyond the one that prompted it. The output should be the question "does this generalize?" being answered yes.
When to move on
A prompt is refined when it produces good output on inputs the original author never saw. If you skip refinement, you ship brittle prompts that work once and confuse everyone after, which is the core problem behind most failed reuse.
Stage 3: Annotate
Annotate makes the prompt self-describing so a stranger can use it correctly without asking the author.
What annotate looks like
Add a name that describes the job, a purpose sentence, the intended model, the required inputs, and at least one example of correct output. This is metadata, not prose. A reader should be able to decide in seconds whether this prompt fits their need.
When to move on
Annotation is complete when a teammate can pick up the prompt cold and use it without a conversation. Under-annotated libraries technically contain reusable prompts but functionally do not, because the cost of figuring out how to use each one exceeds the cost of writing a new one.
Stage 4: File
File places the prompt in your single source of truth, correctly categorized and tagged so it can be found.
What file looks like
Put the prompt where your team already looks, apply tags from a controlled vocabulary, and remove or supersede any older copies. Filing is the stage that converts a personal asset into a shared one.
When to move on
Filing is done when search returns the prompt for the queries people actually use. The failure mode here is a "library" that is really a graveyard: prompts are stored but undiscoverable, so people rewrite from scratch. Where to file is itself a structural decision, weighed in Prompt Libraries and Reuse: Trade-offs, Options, and How to Decide.
Stage 5: Track
Track keeps the prompt trustworthy over time. Models change, requirements change, and a prompt that was correct last quarter may be wrong now.
What track looks like
Version the prompt, keep a one-line changelog per version, and re-test it on a schedule and after every model upgrade. Tracking is what separates a library you can depend on from a library you have to re-verify by hand every time.
When to move on
You never fully leave tracking; it is the steady state of a mature library. The signal that tracking is working is that you can trust a prompt without re-reading it, because its version history tells you it has been maintained.
Why tracking is the stage that gets cut first
Tracking is unglamorous and easy to defer, which is exactly why it is the stage teams abandon under deadline pressure. The damage is delayed rather than immediate, so it feels safe to skip, and then a model upgrade arrives and a dozen untracked prompts quietly degrade at once. Building tracking as a lightweight habit while the library is small is far cheaper than retrofitting it onto a large one in crisis.
Choosing Where to Invest
Map your team to a stage
Most teams are strongest at the stages closest to writing prompts (Capture, Refine) and weakest at the ones closest to maintenance (File, Track). Find the earliest stage where prompts pile up unresolved; that is your bottleneck, and investing anywhere else is premature.
Do not build the whole model at once
CRAFT describes a complete lifecycle, but you should adopt it incrementally, fixing one stage's bottleneck before moving to the next. A team drowning in lost prompts gains nothing from a sophisticated versioning system.
Let intent drive depth
High-stakes prompts deserve the full model; throwaway experiments may stop at Capture. The framework scales down as honestly as it scales up. For a concrete starting sequence, see Getting Started with Prompt Libraries and Reuse.
Use the model as shared diagnostic language
The most underrated benefit of naming the stages is that it gives your team a precise vocabulary for disagreement. When two people argue about whether a prompt is "ready," they are usually arguing about different stages without realizing it: one means refined and the other means tracked. Naming the stages turns a vague dispute into a specific, resolvable question about which stage is actually incomplete, which is worth more than any single practice the model prescribes.
Frequently Asked Questions
How is CRAFT different from just having good prompt-writing habits?
Good prompt-writing habits get you through Capture and Refine. CRAFT adds the often-neglected back half (Annotate, File, Track) that turns a personal skill into a team asset. The model's value is precisely in naming the stages people skip, because those are the ones that make reuse work or fail.
Do small teams really need a five-stage model?
A small team can run all five stages lightly, sometimes in minutes per prompt. The model does not prescribe heavy process; it prescribes a sequence of questions. Even a two-person team benefits from agreeing that a prompt is "captured but not yet annotated," because that shared language prevents the assumption that everything stored is ready to use.
Where does prompt testing fit in the model?
Testing appears twice: lightly in Refine (does this generalize?) and formally in Track (does this still work after changes?). Treating these as separate moments matters, because the first is about reaching reusability and the second is about preserving it. The detailed practices live in Prompt Libraries and Reuse: Best Practices That Actually Work.
What happens if we skip the File stage?
Skipping File is the single most common way a library quietly dies. Prompts get captured, refined, and annotated, then saved somewhere idiosyncratic where nobody finds them. The work of the earlier stages is wasted because discovery never happens, and people default to rewriting prompts they already own.
Key Takeaways
- CRAFT names five stages (Capture, Refine, Annotate, File, Track) so teams share precise language about how mature any given prompt is.
- Capture prevents loss, Refine creates generality, Annotate enables strangers to reuse, File enables discovery, and Track preserves trust over time.
- Find your earliest unresolved bottleneck stage and invest there; building later stages first is premature.
- The back half of the model (File and Track) is the most skipped and the most decisive for whether reuse actually happens.
- The framework scales down honestly: throwaway prompts can stop at Capture while high-stakes ones earn the full lifecycle.
- Tracking is a steady state, not a one-time step, because models and requirements keep changing under your prompts.