
Pulling Clean Graphs From Messy Source Text

Agency Script Editorial

Editorial Team

October 25, 2019 · 8 min read
Tags: prompting for knowledge graph extraction, prompting for knowledge graph extraction playbook, prompting for knowledge graph extraction guide, prompt engineering

A playbook is different from a tutorial. A tutorial walks you through a task once, in ideal conditions. A playbook gives you a set of named plays, each with a clear trigger for when to run it, the person responsible, and its place in the larger sequence. When something goes sideways in production, you do not want to reason from first principles. You want to recognize the situation and reach for the play that handles it.

This article lays out the plays that matter for prompting a language model to extract a knowledge graph. They are ordered roughly the way you would run them in a project: define the target, prove the approach on a sample, harden it for volume, and keep it honest over time. Each play stands alone, so you can run only the ones your situation calls for.

Throughout, "owner" refers to a role rather than a person, because the same individual often wears several hats on a small team. The point is that every play has a single accountable role, so nothing falls through the cracks.

Play 1: Pin the Schema

Trigger

Run this first, before writing any extraction prompt, and rerun it whenever the downstream graph reveals a relationship type nobody anticipated.

Owner and moves

The domain owner defines a closed list of entity types and relationship types, each with a one-line definition and a worked example drawn from real text. Ambiguous cases get explicit rules: when two entities co-occur in a sentence, which relationship applies, and when do you emit none. The output is a written schema specification that the prompt will reference directly. Without this play, every later play produces inconsistent results you cannot aggregate.
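As a rough illustration, a schema specification of this kind might be captured in code along the following lines. The entity types, relationship types, and the `SIGNATURES` set are hypothetical placeholders, not a recommended vocabulary; the point is only that the list is closed and every type carries a definition and an example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeSpec:
    definition: str  # one-line definition
    example: str     # worked example drawn from real text


# Hypothetical closed vocabulary; substitute your own domain's types.
ENTITY_TYPES = {
    "Company": TypeSpec("A registered business organization.", "Acme Corp filed its annual report."),
    "Person": TypeSpec("A named individual.", "Jane Doe joined the board."),
}
RELATION_TYPES = {
    "EMPLOYS": TypeSpec("The company has the person on staff.", "Acme Corp hired Jane Doe."),
    "ACQUIRED": TypeSpec("The first company bought the second.", "Acme Corp acquired Globex."),
}
# Closed list of permitted (subject type, relation, object type) combinations.
SIGNATURES = {
    ("Company", "EMPLOYS", "Person"),
    ("Company", "ACQUIRED", "Company"),
}


def is_allowed(subject_type: str, relation: str, object_type: str) -> bool:
    """Return True only for combinations the schema explicitly lists."""
    return (subject_type, relation, object_type) in SIGNATURES
```

Anything outside `SIGNATURES` is treated as "emit none", which is one simple way to encode the rule for ambiguous co-occurrences.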

Play 2: Prove It on a Sample

Trigger

Immediately after the schema exists, before any investment in scale.

Owner and moves

The prompt owner builds an extraction prompt that embeds the schema, includes three to five diverse examples covering different relationship types, and requests structured output with a source span for each triple. Run it against twenty representative documents and read every result by hand. You are checking two things: does the format hold, and does the model interpret the schema the way you intended. Disagreements here are cheap to fix and expensive to ignore later.
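A minimal sketch of such a prompt builder follows. The schema contents, the single example, and the output key names (`subject`, `relation`, `object`, `source_span`) are assumptions made for illustration; a real prompt would carry the three to five diverse examples described above.

```python
import json

# Hypothetical schema and example; in practice these come from the schema specification.
SCHEMA = {
    "entity_types": {"Company": "A registered business organization.", "Person": "A named individual."},
    "relation_types": {"EMPLOYS": "The company has the person on staff."},
}
EXAMPLES = [
    {
        "text": "Acme Corp hired Jane Doe as CFO.",
        "triples": [{"subject": "Acme Corp", "relation": "EMPLOYS", "object": "Jane Doe",
                     "source_span": "Acme Corp hired Jane Doe"}],
    },
]


def build_prompt(document: str) -> str:
    """Assemble schema, few-shot examples, and the target document into one prompt."""
    parts = [
        "Extract relationships from the document using ONLY the schema below.",
        "Schema:",
        json.dumps(SCHEMA, indent=2),
        "Return a JSON list of objects with keys subject, relation, object, source_span.",
        "source_span must be copied verbatim from the document. Return [] if nothing applies.",
    ]
    for example in EXAMPLES:
        parts.append("Text: " + example["text"])
        parts.append("Output: " + json.dumps(example["triples"]))
    parts.append("Text: " + document)
    parts.append("Output:")
    return "\n".join(parts)
```

Keeping the schema and examples as data rather than prose makes it easier to rerun the sample whenever either changes.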

Play 3: Split Entities From Relationships

Trigger

When documents are long, when relationships span distant paragraphs, or when recall in the sample run was disappointing.

Owner and moves

The pipeline owner decomposes extraction into two passes. The first pass extracts and canonicalizes entities, producing a resolved entity list per document. The second pass takes that list and asks only about relationships among the known entities. This recovers long-range relationships a single pass misses and makes each stage independently testable. The deeper rationale for decomposition lives in What People Get Wrong About Pulling Graphs From Text.
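The two-pass shape can be sketched as below. The model calls are passed in as plain functions, and `fake_entities` and `fake_relations` are hypothetical stand-ins so the sketch runs without any API; filtering out triples whose endpoints are not in the pass-one list is an added safeguard, not something the play itself prescribes.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]


def two_pass_extract(
    document: str,
    extract_entities: Callable[[str], List[str]],
    extract_relations: Callable[[str, List[str]], List[Triple]],
) -> Tuple[List[str], List[Triple]]:
    # Pass 1: resolved entity list for this document.
    entities = extract_entities(document)
    known = set(entities)
    # Pass 2: ask only about relationships among the known entities.
    candidates = extract_relations(document, entities)
    # Drop any triple whose endpoints were not produced by pass 1.
    triples = [t for t in candidates if t[0] in known and t[2] in known]
    return entities, triples


# Stand-ins for model calls so the sketch runs without an API.
def fake_entities(document: str) -> List[str]:
    return [name for name in ("Acme Corp", "Jane Doe") if name in document]


def fake_relations(document: str, entities: List[str]) -> List[Triple]:
    return [("Acme Corp", "EMPLOYS", "Jane Doe"), ("Acme Corp", "ACQUIRED", "Globex")]
```

Because each pass is an ordinary function, each can be tested against its own fixtures before the two are combined.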

Play 4: Resolve Entities Against a Canon

Trigger

The moment you see duplicate nodes for the same real-world entity in the sample graph.

Owner and moves

The pipeline owner maintains a canonical entity list and passes it into extraction so the model extends it rather than inventing new nodes for each spelling variant. New mentions are matched against the canon; genuinely new entities get added with a canonical form. Doing this during extraction, not after, is what keeps the graph from fragmenting into edges that point at half a dozen versions of the same company.
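One simple way to hold a canon is sketched below, assuming exact matching on a normalized key. The `Canon` class and `normalize` helper are hypothetical, and real projects often need fuzzier matching or model-assisted resolution; here the matching happens in code, with `names()` supplying the list that would be passed into the extraction prompt.

```python
import re


def normalize(mention: str) -> str:
    """Lowercase and strip punctuation so spelling variants share one key."""
    return re.sub(r"[^a-z0-9 ]", "", mention.lower()).strip()


class Canon:
    def __init__(self) -> None:
        self._by_key = {}  # normalized alias -> canonical form

    def add(self, canonical: str, aliases=()) -> None:
        for name in (canonical, *aliases):
            self._by_key[normalize(name)] = canonical

    def resolve(self, mention: str) -> str:
        """Return the canonical form, registering the mention if it is new."""
        key = normalize(mention)
        if key not in self._by_key:
            self._by_key[key] = mention.strip()
        return self._by_key[key]

    def names(self) -> list:
        """Canonical names, suitable for embedding in the extraction prompt."""
        return sorted(set(self._by_key.values()))
```

For example, after `canon.add("Acme Corporation", aliases=["Acme Corp.", "ACME"])`, a later mention of "acme corp" resolves to the existing node instead of creating a new one.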

Play 5: Verify Against the Source

Trigger

Before any extracted triple is allowed into the production graph.

Owner and moves

The validation owner runs every triple through two checks. Structural validation confirms the output parses and conforms to the schema. Content validation confirms the cited source span actually exists in the document and that the entities appear in it. Triples whose confidence falls below a threshold route to human review instead of the graph. This play is where fabricated relationships get caught, and it is the one teams most often skip because the JSON "looks fine."
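The two checks might look like the sketch below. Several things here are assumptions: the output is a JSON list, each triple carries a numeric `confidence` field, field values are strings, the threshold is 0.7, and "the entities appear in it" is read as appearing inside the cited span. Sending ungrounded triples to a rejected list rather than to review is also a choice made for this sketch.

```python
import json

ALLOWED_RELATIONS = {"EMPLOYS", "ACQUIRED"}  # hypothetical closed list
REQUIRED_KEYS = {"subject", "relation", "object", "source_span", "confidence"}


def validate(raw_output: str, document: str, threshold: float = 0.7):
    """Split model output into (accepted, needs_review, rejected)."""
    accepted, needs_review, rejected = [], [], []
    try:
        triples = json.loads(raw_output)
    except json.JSONDecodeError:
        return accepted, needs_review, [("unparseable output", raw_output)]
    if not isinstance(triples, list):
        return accepted, needs_review, [("expected a JSON list", triples)]
    for t in triples:
        # Structural validation: required keys present, relation in the schema.
        if not isinstance(t, dict) or not REQUIRED_KEYS <= t.keys():
            rejected.append(("missing keys", t))
            continue
        if str(t["relation"]) not in ALLOWED_RELATIONS:
            rejected.append(("unknown relation", t))
            continue
        # Content validation: span is in the document, entities are in the span.
        span = str(t["source_span"])
        if span not in document or str(t["subject"]) not in span or str(t["object"]) not in span:
            rejected.append(("span not grounded", t))
            continue
        # Low-confidence triples go to human review instead of the graph.
        if float(t["confidence"]) >= threshold:
            accepted.append(t)
        else:
            needs_review.append(t)
    return accepted, needs_review, rejected
```

A triple citing a span that never occurs in the document parses cleanly and passes the structural check, which is exactly why the content check is a separate step.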

Play 6: Route by Difficulty

Trigger

When extraction cost at volume is higher than the value justifies.

Owner and moves

The pipeline owner adds a cheap classifier that sorts incoming documents by length and complexity. Short, simple documents take the single-pass path; long or dense ones take the decomposed multi-pass path. This keeps average cost low without sacrificing quality on the hard documents. It is the difference between paying premium rates on every document and paying them only where they matter.
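A cheap router can be as small as the heuristic below. The word-count and sentence-length thresholds are illustrative assumptions, not tuned values, and average sentence length is only a crude proxy for density; calibrate against your own corpus.

```python
import re


def route(document: str, max_words: int = 400, max_sentence_words: int = 30) -> str:
    """Pick an extraction path from crude length and density signals."""
    words = document.split()
    sentences = [s for s in re.split(r"[.!?]+", document) if s.strip()]
    avg_sentence_words = len(words) / max(len(sentences), 1)
    if len(words) <= max_words and avg_sentence_words <= max_sentence_words:
        return "single_pass"
    return "multi_pass"
```

The router itself costs almost nothing to run, so even a rough split can reduce how many documents take the expensive path.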

Play 7: Measure Against a Gold Set

Trigger

Continuously, on every prompt or model change.

Owner and moves

The evaluation owner maintains a held-out set of documents with hand-labeled triples and reports precision and recall on every change. Precision guards against fabrication; recall guards against silent drops. Watching both prevents the common failure where tightening the prompt to reduce hallucination quietly halves recall. The discipline mirrors the evaluation thinking in Controlling Formality and Register in Output: Best Practices That Actually Work, where surface compliance can hide real regressions.
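The two metrics reduce to set arithmetic over triples, as in this sketch. It assumes exact matching on (subject, relation, object) tuples after entity resolution; partial-credit or fuzzy matching would need a different comparison.

```python
def precision_recall(predicted, gold):
    """Compare predicted triples with hand-labeled gold triples by exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted triples that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold triples that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Reporting both numbers on every prompt or model change makes a recall drop visible even when precision improves.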

Play 8: Keep the Graph Fresh

Trigger

Whenever a source document changes or is replaced.

Owner and moves

The pipeline owner tracks provenance at the triple level: every relationship records which document and version it came from. When a document changes, re-extract it, reconcile the new triples against the old, and retire any relationship that no longer has source support. Triple-level provenance is what makes incremental updates possible without rebuilding the entire graph from scratch.
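A sketch of the reconcile step, assuming the graph is held as an in-memory dictionary from each triple to its set of (document id, version) provenance records. The `reconcile` function and this storage layout are hypothetical; a graph database would express the same idea with its own queries.

```python
def reconcile(graph: dict, doc_id: str, version: int, new_triples: set) -> list:
    """Update provenance after re-extracting one document; return retired triples.

    graph maps triple -> set of (doc_id, version) provenance records.
    """
    # Remove this document's old support from every triple.
    for triple in list(graph):
        graph[triple] = {p for p in graph[triple] if p[0] != doc_id}
    # Record support from the new version.
    for triple in new_triples:
        graph.setdefault(triple, set()).add((doc_id, version))
    # Retire relationships that no longer have any source support.
    retired = [t for t, provenance in graph.items() if not provenance]
    for t in retired:
        del graph[t]
    return retired
```

A relationship supported by a second document survives the update, which is the behavior that makes incremental refreshes safe.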

Play 9: Contain Ambiguity

Trigger

When repeated runs of the same document disagree on a relationship, or when reviewers flag passages that read multiple ways.

Owner and moves

The validation owner treats run-to-run disagreement as a signal rather than noise. Rather than forcing a deterministic answer, the pipeline records the competing interpretations, attaches their source spans, and routes the passage to human judgment. The reviewer decides which reading is correct, or records both with provenance if the source is genuinely ambiguous. This play prevents the false confidence that comes from collapsing real ambiguity into a single committed triple, a trap explored in Straight Answers on Turning Text Into Knowledge Graphs.
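Detecting disagreement across repeated runs might be sketched as follows. It assumes each run yields (subject, relation, object, source_span) tuples and treats an entity pair as settled only when every run produced the same single relation; the function name and the review-queue layout are hypothetical.

```python
from collections import defaultdict


def split_by_agreement(runs):
    """runs: one list of (subject, relation, object, source_span) per repeated run."""
    # (subject, object) -> relation -> run indexes and spans that support it
    readings = defaultdict(lambda: defaultdict(lambda: {"runs": set(), "spans": set()}))
    for index, run in enumerate(runs):
        for subject, relation, obj, span in run:
            entry = readings[(subject, obj)][relation]
            entry["runs"].add(index)
            entry["spans"].add(span)

    committed, review_queue = [], []
    for (subject, obj), by_relation in readings.items():
        unanimous = [r for r, e in by_relation.items() if len(e["runs"]) == len(runs)]
        if len(by_relation) == 1 and unanimous:
            committed.append((subject, unanimous[0], obj))
        else:
            # Keep every competing interpretation, with spans, for the reviewer.
            review_queue.append({
                "pair": (subject, obj),
                "readings": {r: {"runs": len(e["runs"]), "spans": sorted(e["spans"])}
                             for r, e in by_relation.items()},
            })
    return committed, review_queue
```

Nothing in the review queue is written to the graph until a reviewer picks a reading or records both.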

Play 10: Document the Pipeline for Handoff

Trigger

Before any team member who built the pipeline leaves it, and ideally from the start.

Owner and moves

The pipeline owner produces a short runbook per stage: its input, its output, how to run it, and how to recognize failure. Together with the schema document and the gold set, these runbooks form the handoff package that lets a newcomer operate the pipeline without decoding anyone's prompts. The play converts a pipeline that lives in one person's head into an asset the team owns, which is the difference between a fragile script and a durable system.

Frequently Asked Questions

In what order should I run these plays?

Roughly the order presented: pin the schema, prove it on a sample, then add decomposition, resolution, and validation as the sample reveals their need. Routing and freshness are operational plays you add once the pipeline runs at volume. Do not add complexity before the sample run shows it is warranted.

Can one person own all of these plays?

On a small team, yes. The roles exist to ensure single accountability per play, not to mandate headcount. What matters is that the schema, the prompt, validation, and evaluation each have a clear owner, even if that owner is the same person.

Which play has the highest payoff?

Pinning the schema. Every downstream play depends on a clear, closed vocabulary, and most extraction problems trace back to a schema that was never fully specified. Time spent here returns more than anywhere else in the sequence.

How often should I rerun the sample play?

Whenever the schema changes meaningfully, whenever you switch models, and whenever the production graph surprises you. The sample run is cheap and catches interpretation drift before it contaminates the full corpus.

Do I need all ten plays for a small project?

No. A small project with short, clean documents might run only the schema, sample, and validation plays. Add decomposition, resolution, routing, and freshness as your corpus grows in size and messiness. The plays are a menu, not a mandatory sequence.

Key Takeaways

  • Pin a closed schema first; nearly every extraction problem traces back to an underspecified vocabulary.
  • Prove the prompt on a hand-read sample before investing in scale, catching interpretation drift while it is cheap to fix.
  • Decompose long documents into entity and relationship passes, and resolve entities against a canon to prevent fragmentation.
  • Verify every triple against its source span before it enters the graph; structural validation alone does not catch fabrication.
  • Measure precision and recall on a gold set continuously, and track triple-level provenance so the graph can update incrementally.
