Entities, Relations, and Triples: Graph Extraction From Scratch

If you have heard the phrase "knowledge graph extraction" and felt it was reserved for data scientists with specialized tooling, this article is for you. The core idea is simpler than the jargon suggests, and a capable language model plus a thoughtful prompt now puts it within reach of anyone willing to learn a few concepts. By the end of this piece you will understand what a knowledge graph is, what extraction means, and how to write a first prompt that pulls structured facts out of plain text.

We will assume no prior background. Every term gets defined when it first appears. The aim is not to make you an expert in one reading but to give you a solid mental model and enough confidence to run your own small experiment today.

Knowledge graphs power a lot of things you already use: search results that show facts about a person, recommendation systems, and question-answering tools. Learning to build one from text is a genuinely useful skill, and the barrier to entry has never been lower.

What Is a Knowledge Graph

A Web of Facts, Not a Pile of Text

Imagine a corkboard where index cards hold names—people, companies, products—and strings connect cards that relate to each other. A string from "Steve Jobs" to "Apple" labeled "founded" captures a fact. A knowledge graph is that corkboard, scaled up and made searchable. Each card is an entity; each string is a relationship.

The power comes from connection. Once facts are linked, you can ask questions that span many of them: "Which companies were founded by people who previously worked at this firm?" That question is hard to answer by reading text but easy to answer by following links in a graph.
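To make "following links" concrete, here is a minimal Python sketch of that kind of question. The people, companies, and the worked_at label are all invented for illustration, and a real graph would live in graph software rather than a Python list.

```python
from collections import defaultdict

# Hypothetical facts, invented for illustration: (subject, relationship, object).
facts = [
    ("Alice", "worked_at", "Acme"),
    ("Bob", "worked_at", "Acme"),
    ("Carol", "worked_at", "Globex"),
    ("Alice", "founded", "Widgetly"),
    ("Bob", "founded", "Gizmo Labs"),
    ("Carol", "founded", "Initech"),
]


def build_index(facts):
    """Index facts as relationship -> subject -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relationship, obj in facts:
        index[relationship][subject].add(obj)
    return index


def founded_by_alumni(facts, firm):
    """Return companies founded by people who worked at `firm`."""
    index = build_index(facts)
    # Hop 1: find everyone linked to the firm by worked_at.
    alumni = {p for p, firms in index["worked_at"].items() if firm in firms}
    # Hop 2: follow each person's founded links.
    companies = set()
    for person in alumni:
        companies |= index["founded"].get(person, set())
    return sorted(companies)


print(founded_by_alumni(facts, "Acme"))  # ['Gizmo Labs', 'Widgetly']
```

The question takes two hops: first from the firm to its former employees, then from each person to the companies they founded. Answering it from raw text would mean rereading every document.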

Entities and Relationships in Plain Terms

An entity is a thing the graph knows about: a person, a place, an organization, a product. A relationship is how two entities connect: founded, employs, located_in, treats. That is the whole vocabulary you need to start. Everything else builds on these two ideas.

What Extraction Means

Reading Text and Pulling Out Facts

Extraction is the act of reading unstructured text—a news article, a contract, a research paper—and pulling out the entities and relationships hidden inside it. A human does this naturally while reading. Knowledge graph extraction asks a computer to do it systematically and at scale.

The Triple: The Unit of Extracted Knowledge

The output of extraction is a list of triples. A triple has three parts: a subject, a predicate, and an object. "Marie Curie won the Nobel Prize" becomes the triple (Marie Curie, won, Nobel Prize). The subject and object are entities; the predicate is the relationship. Once you see knowledge as triples, extraction becomes concrete: read text, produce triples.
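If it helps to see a triple as data, one possible representation in Python is a small named tuple. The class name Triple and the field name obj are choices made for this sketch, not a standard.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One extracted fact: subject and obj are entities, predicate links them."""
    subject: str
    predicate: str
    obj: str


# The example sentence from the text, written as a triple.
triple = Triple("Marie Curie", "won", "Nobel Prize")

print(triple.subject)    # Marie Curie
print(triple.predicate)  # won
print(triple.obj)        # Nobel Prize
```

A named tuple still behaves like a plain three-item tuple, so a list of these can be sorted, deduplicated, or written to a file without extra work.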

Why Prompts Drive Extraction

Telling the Model Exactly What You Want

A language model can read text and understand it, but it does not know which facts you care about or what format you need unless you tell it. The prompt is that instruction. A vague prompt gets vague results; a precise prompt gets structured, usable triples. Learning to write good prompts is the heart of the skill, and the broader discipline is covered in Turning Unstructured Text Into Connected Entity Graphs.

The Role of a Schema

A schema is a short list of the entity types and relationship types you allow. Think of it as the legend on a map. By telling the model "only use these entity types and these relationships," you keep its output consistent. Without a schema, the model tends to label the same relationship several different ways, and the resulting graph is too inconsistent to query.

Writing Your First Extraction Prompt

Start With a Tiny Schema

Pick something small. Say you want to extract facts about companies and their founders. Your entity types are Person and Organization. Your relationship type is founded. That is a complete, valid schema. Starting tiny keeps your first attempt understandable.
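Written down as data, that schema might look like the sketch below. The Schema class and the off_schema helper are hypothetical names chosen for this example, and the sample triples are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """The entity types and relationship types an extraction is allowed to use."""
    entity_types: frozenset
    relationship_types: frozenset


# The tiny schema from the text: two entity types, one relationship type.
SCHEMA = Schema(
    entity_types=frozenset({"Person", "Organization"}),
    relationship_types=frozenset({"founded"}),
)


def off_schema(triples, schema):
    """Return triples whose predicate is not an allowed relationship type."""
    return [t for t in triples if t[1] not in schema.relationship_types]


# Hypothetical model output, invented for illustration.
triples = [
    ("Jane Doe", "founded", "Example Robotics"),
    ("Jane Doe", "started", "Example Robotics"),
]

print(off_schema(triples, SCHEMA))  # [('Jane Doe', 'started', 'Example Robotics')]
```

This sketch checks only the relationship label. Checking entity types too would require asking the model to return a type alongside each entity, which is a reasonable next step once the basic version works.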

Specify the Output Format

Tell the model to return its answer as a simple list, where each line is one triple in the form subject, predicate, object. Ask for nothing else—no explanation, no commentary. A clean format means you can read the results easily and, later, feed them into software.
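Here is a rough sketch of what "feed them into software" can mean at its simplest: a Python function that reads the model's reply line by line and keeps only lines that split into exactly three parts. The function name parse_triples and the sample reply are invented for this example.

```python
def parse_triples(model_output):
    """Split a reply into (subject, predicate, object) tuples.

    Returns two lists: well-formed triples, and lines that did not parse.
    """
    triples, rejected = [], []
    for line in model_output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = [part.strip() for part in line.split(",")]
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))
        else:
            rejected.append(line)
    return triples, rejected


# Hypothetical reply, invented for illustration.
reply = """Here are the triples:
Steve Jobs, founded, Apple

Acme, Inc., founded, Nothing
"""

triples, rejected = parse_triples(reply)
print(triples)   # [('Steve Jobs', 'founded', 'Apple')]
print(rejected)  # ['Here are the triples:', 'Acme, Inc., founded, Nothing']
```

The second rejected line shows a weakness of the comma format: a name that itself contains a comma produces four parts. If your text has names like that, asking for a different separator, such as a vertical bar, is one way around it.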

Add a Grounding Rule

Include one sentence: "Only extract facts that are actually stated in the text. If a fact is not clearly there, leave it out." This single instruction discourages the model from inventing relationships that sound plausible but are not in the source. It does not guarantee it, which is why you still check the output. The reasons this matters so much are unpacked in Why Graph Extraction Prompts Silently Drop Half Your Entities.
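Putting the three pieces together (schema, output format, grounding rule), a first prompt could be assembled like this. Treat the wording as one possible phrasing rather than a proven template. The build_prompt function and the sample sentence about Jane Doe are invented for this sketch.

```python
# The grounding sentence quoted in the text above.
GROUNDING_RULE = (
    "Only extract facts that are actually stated in the text. "
    "If a fact is not clearly there, leave it out."
)


def build_prompt(entity_types, relationship_types, text):
    """Assemble schema, format instruction, grounding rule, and source text."""
    return "\n".join([
        "Extract facts from the text below.",
        "Allowed entity types: " + ", ".join(entity_types),
        "Allowed relationship types: " + ", ".join(relationship_types),
        "Return one triple per line in the form: subject, predicate, object",
        "Return nothing else: no explanation, no commentary.",
        GROUNDING_RULE,
        "",
        "Text:",
        text,
    ])


# Hypothetical source sentence, invented for illustration.
prompt = build_prompt(
    entity_types=["Person", "Organization"],
    relationship_types=["founded"],
    text="Jane Doe founded Example Robotics in 2019.",
)
print(prompt)
```

You can paste the printed prompt into any chat interface. Keeping the pieces as separate arguments makes it easy to change the schema later without rewriting the rest.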

Trying It on Real Text

Feed In a Short Paragraph

Take a couple of sentences from a news story about a startup. Paste your schema, your format instruction, your grounding rule, and the paragraph into the model. Read what comes back. You should see one or two clean triples that match facts in the text.

Check the Output Against the Source

For each triple the model produced, find the sentence that supports it. If you cannot find one, the model either misread or invented it. This habit of checking against the source is the foundation of trustworthy extraction, and it scales into the verification routine in Ship-Ready Verification Steps for Graph Extraction Prompts.
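Part of that check can be roughed out in code. The sketch below looks for a sentence that mentions both the subject and the object of a triple. It is a deliberately naive first pass: it does not confirm the relationship itself, and it misses paraphrases and pronouns, so it narrows down what you need to read rather than replacing the reading. The function name find_support and the sample text are invented.

```python
import re


def find_support(triple, source_text):
    """Return the first sentence naming both subject and object, or None."""
    subject, _predicate, obj = triple
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", source_text.strip())
    for sentence in sentences:
        lowered = sentence.lower()
        if subject.lower() in lowered and obj.lower() in lowered:
            return sentence
    return None


# Hypothetical source text and triples, invented for illustration.
source = "Jane Doe founded Example Robotics in 2019. The company is based in Lisbon."

supported = ("Jane Doe", "founded", "Example Robotics")
unsupported = ("Jane Doe", "founded", "Lisbon Labs")

print(find_support(supported, source))    # Jane Doe founded Example Robotics in 2019.
print(find_support(unsupported, source))  # None
```

A triple that comes back as None is the one to look at first: either the model invented it, or the text states it in words this simple match cannot see.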

Common Beginner Stumbles

Asking for Too Much at Once

Beginners often hand the model a huge schema and a long document and get messy results. Start narrow. Master extracting one relationship type from short text before expanding. Confidence comes from small wins.

Forgetting to Constrain Relationships

If you do not list the exact relationships you want, the model improvises. You will get "founded," "started," "created," and "established" for what should be one relationship. Always enumerate your relationship types explicitly. For a sequenced walkthrough of doing this right, see Walk Text Through a Triple-Producing Extraction Pipeline.
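If you already have output with mixed labels, a small cleanup pass can fold the variants into one relationship. The mapping below uses the four labels from the paragraph above. The function name normalize_predicates and the sample triples are invented for this sketch. It repairs output after the fact and is not a substitute for listing relationship types in the prompt.

```python
# Variant labels mapped to the single relationship the schema allows.
CANONICAL = {
    "founded": "founded",
    "started": "founded",
    "created": "founded",
    "established": "founded",
}


def normalize_predicates(triples):
    """Map known variant labels to their canonical form.

    Returns (normalized, unknown). Duplicates that appear after
    normalization are dropped, keeping the first occurrence.
    """
    normalized, unknown = [], []
    for subject, predicate, obj in triples:
        canonical = CANONICAL.get(predicate.strip().lower())
        if canonical is None:
            unknown.append((subject, predicate, obj))
        else:
            normalized.append((subject, canonical, obj))
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(normalized)), unknown


# Hypothetical model output, invented for illustration.
raw = [
    ("Jane Doe", "founded", "Example Robotics"),
    ("Jane Doe", "Started", "Example Robotics"),
    ("Jane Doe", "advises", "Example Robotics"),
]

normalized, unknown = normalize_predicates(raw)
print(normalized)  # [('Jane Doe', 'founded', 'Example Robotics')]
print(unknown)     # [('Jane Doe', 'advises', 'Example Robotics')]
```

Anything in the unknown list is a label you never planned for, which is a useful signal that the prompt needs a tighter relationship list.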

Trusting Output Without Checking

A model's output reads just as confidently when it is wrong as when it is right. Always verify a sample of triples against the source until you trust your prompt. Verification is not a sign of distrust; it is how every serious extraction workflow operates.

Frequently Asked Questions

Do I need to know how to code to try extraction?

Not to start. You can paste a prompt and a paragraph into a chat interface and read the triples it returns by hand. Coding becomes useful when you want to process many documents automatically and load triples into graph software, but the core skill is prompt writing.

What is the difference between an entity and a relationship?

An entity is a thing: a person, company, place, or product. A relationship is how two entities connect, like founded or employs. In the triple (Apple, founded_by, Steve Jobs), Apple and Steve Jobs are entities and founded_by is the relationship.

Why do I need a schema if the model is smart?

The model is capable but not mind-reading. Without a schema it guesses which facts matter and invents inconsistent labels. A schema tells it exactly which entity and relationship types to use, which keeps output consistent enough to assemble into a real graph.

What does grounding mean in this context?

Grounding means every extracted fact is actually supported by the source text rather than inferred or invented. A grounding rule instructs the model to extract only stated facts and skip anything it cannot find in the text, which helps prevent fabricated relationships.

How much text should I start with?

A few sentences. Starting small lets you read every triple and check it against the source, which builds your intuition for what good output looks like. Once a short prompt works reliably, scale up to paragraphs and then documents.

What can I build once I have extracted triples?

A list of triples can be loaded into graph software to answer connected questions, power a search feature, or feed a recommendation system. Even before any software, a clean set of triples is a structured summary of facts you can query and reason over.

Key Takeaways

A knowledge graph is a web of facts: entities (things) connected by relationships, like a corkboard of index cards joined by labeled strings.
Extraction means reading plain text and pulling out structured facts, expressed as triples of subject, predicate, and object.
The prompt is what tells the model which facts you want and in what format, so prompt quality drives extraction quality.
A schema—a short list of allowed entity and relationship types—keeps the model's output consistent enough to assemble.
Always include a grounding rule so the model extracts only facts stated in the text, and always check output against the source.
Start tiny: one relationship, a few sentences, careful verification. Confidence and complexity grow from there.

Agency Script Editorial
