Transcript, Summary, or Slots: Deciding How Prompts Hold State

There is no single correct way to carry conversation state into a prompt, which is exactly why teams argue about it. Some pass the full transcript and trust the model to keep track. Others maintain a rolling summary. Others maintain structured slots. Each works in some situations and fails badly in others, and the failures are usually invisible until a conversation gets long.

This article lays out the three dominant approaches, the axes along which they differ, and a decision rule for picking one. The goal is not to crown a winner — it is to help you match the approach to the conversation you are actually building, because a choice that is perfect for a two-turn helper is wrong for a twenty-turn agent.

If you want the underlying vocabulary for these approaches first, A Reusable Model for Tracking Dialogue State in Prompts defines the render stage that all three implement differently.

It helps to see these three not as rival camps but as points on a spectrum from "let the model do the remembering" to "the system does the remembering and the model just acts on it." Full transcript sits at one end, structured state at the other, and rolling summary in between. Every real assistant lands somewhere on that line, and the right spot moves as the conversation and its stakes change.

The Three Approaches

Full transcript

Pass every turn of the conversation verbatim into each prompt. The model infers all state from reading the history.

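This is the simplest approach to implement and the most natural for short conversations. It requires no capture step and no structured state. Its weakness is that it scales poorly: as the transcript grows, the model misweights or misses facts, and the tokens sent with each prompt grow linearly with the number of turns, so the total sent over a whole conversation grows roughly quadratically.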
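A minimal sketch of the full-transcript approach may make the trade-off concrete. The `Turn` type, the `render_full_transcript` function, and the `role: text` line format are illustrative assumptions, not a prescribed prompt layout.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


def render_full_transcript(system: str, history: list[Turn], user_message: str) -> str:
    """Build a prompt that carries every prior turn verbatim (hypothetical layout)."""
    lines = [system, ""]
    for turn in history:
        lines.append(f"{turn.role}: {turn.text}")
    lines.append(f"user: {user_message}")
    lines.append("assistant:")
    return "\n".join(lines)


history = [
    Turn("user", "I need to renew my plan."),
    Turn("assistant", "Sure. Which plan are you on?"),
]
prompt = render_full_transcript("You are a billing assistant.", history, "The annual one.")
```

Nothing here tracks state: the prompt simply gets one line longer for every turn, and the model is left to work out what has already been established.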

Rolling summary

Periodically compress older turns into a summary, keeping recent turns verbatim. The prompt carries a summary plus a recent window.

This controls token growth and preserves long-range context. Its weakness is that summarization is lossy and can silently drop a fact that turns out to matter, and the summary itself can introduce errors the model then treats as ground truth.
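The sketch below shows one way the compression step could look, and why it is lossy. The function names, the `window` parameter, and the `SPRING25` code are hypothetical, and `naive_summarize` is a stand-in for the model call a real system would make.

```python
from typing import Callable


def roll_summary(
    summary: str,
    turns: list[str],
    window: int,
    summarize: Callable[[str, list[str]], str],
) -> tuple[str, list[str]]:
    """Fold turns that fall outside the recent window into the summary."""
    cut = len(turns) - window
    if cut <= 0:
        return summary, turns  # nothing is old enough to compress yet
    overflow, recent = turns[:cut], turns[cut:]
    return summarize(summary, overflow), recent


def naive_summarize(summary: str, overflow: list[str]) -> str:
    """Placeholder summarizer: keeps only the first six words of each old turn.

    A real system would call a model here; the truncation just makes the loss visible.
    """
    clipped = [" ".join(turn.split()[:6]) for turn in overflow]
    return " / ".join(([summary] if summary else []) + clipped)


turns = [
    "user: I want to renew, and my discount code is SPRING25.",
    "assistant: Thanks, I can help with that renewal.",
    "user: Which plans are available?",
    "assistant: Monthly and annual.",
]
new_summary, recent = roll_summary("", turns, window=2, summarize=naive_summarize)
```

In this toy run the discount code does not survive compression, and nothing in the resulting prompt signals that it ever existed. A model-written summary fails less crudely, but the failure has the same shape.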

Structured slots and state

Capture facts into named fields and inject the relevant subset each turn, as described in When a Prompt Forgets the User Already Paid: State Examples.

This is the most reliable for high-stakes, long conversations because state is explicit and constrainable. Its weakness is that it requires the most engineering: you must build capture, render, and reconcile rather than relying on the model.
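As a rough illustration of capture and render, the sketch below stores facts in named fields and injects only the subset a given turn needs. The `DialogueState` class, its method names, and the slot names are assumptions made for this example. Reconcile logic is left out to keep it short.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    slots: dict[str, str] = field(default_factory=dict)

    def capture(self, name: str, value: str) -> None:
        """Record a fact in a named field, overwriting any earlier value."""
        self.slots[name] = value

    def render(self, relevant: list[str]) -> str:
        """Render only the fields this turn needs, marking missing ones explicitly."""
        lines = ["Known state:"]
        for name in relevant:
            value = self.slots.get(name, "(not yet collected)")
            lines.append(f"- {name}: {value}")
        return "\n".join(lines)


state = DialogueState()
state.capture("plan", "annual")
state.capture("payment_status", "paid")
state.capture("preferred_language", "en")

# Only the fields relevant to a billing turn are injected.
block = state.render(["plan", "payment_status", "renewal_date"])
```

Because every fact lives in a named field, the prompt can state both what is known and what is still missing, rather than leaving the model to infer either from history.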

The Axes That Separate Them

Reliability under length

Full transcript degrades as conversations grow. Summaries hold longer but risk silent loss. Structured state is most robust because the relevant facts are always explicit. If your conversations run long, this axis dominates.

Token cost

Full transcript is cheapest to build but most expensive to run at length. Summaries cut runtime cost. Structured state is leanest at runtime because you inject only what each turn needs.
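A back-of-the-envelope calculation shows the runtime difference. The figures below (20 turns, 100 tokens per turn, a 4-turn window, 150 tokens of fixed summary or state) are made-up inputs chosen only to illustrate the shape of the curves.

```python
def transcript_tokens_sent(turns: int, tokens_per_turn: int) -> int:
    """Total tokens sent over a conversation when each prompt carries all turns so far."""
    # The prompt for turn k carries k turns' worth of text.
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def bounded_tokens_sent(turns: int, tokens_per_turn: int, window: int, fixed_overhead: int) -> int:
    """Total when each prompt carries at most `window` recent turns plus a fixed-size block."""
    return sum(
        min(k, window) * tokens_per_turn + fixed_overhead
        for k in range(1, turns + 1)
    )


full = transcript_tokens_sent(turns=20, tokens_per_turn=100)
bounded = bounded_tokens_sent(turns=20, tokens_per_turn=100, window=4, fixed_overhead=150)
print(full, bounded)  # 21000 10400
```

Under these assumptions the bounded prompt sends about half the tokens over 20 turns, and the gap widens with length because the full-transcript total grows with the square of the turn count.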

Engineering effort

Full transcript is nearly free. Summaries require compression logic. Structured state requires the most upfront work but pays back in reliability and lower runtime cost, a trade-off quantified in Putting Numbers Behind Dialogue State Management in Prompts.

Constrainability

This axis is often overlooked. Only structured state lets you anchor negative constraints to specific fields — do not re-ask, do not re-present. With transcript or summary alone, constraints are vaguer and less reliable.
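One way to picture field-anchored constraints is to generate them directly from state. The function name, the wording of the constraint sentences, and the example values are illustrative assumptions.

```python
def render_constraints(slots: dict[str, str], presented: set[str]) -> list[str]:
    """Turn explicit state into negative constraints tied to specific fields."""
    constraints = []
    for name, value in slots.items():
        constraints.append(f"Do not ask for {name} again; it is already known ({value}).")
    for item in sorted(presented):  # sorted for stable prompt output
        constraints.append(f"Do not present {item} again; the user has already seen it.")
    return constraints


constraints = render_constraints(
    slots={"email": "a@example.com"},
    presented={"renewal offer"},
)
```

With a transcript or summary there is no field to hang such a sentence on, so the equivalent instruction tends to be a general "do not repeat yourself" that the model must interpret.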

A Decision Rule

The choice falls out of a few questions about your conversation.

Work through these in order

Is the conversation short and low-stakes? Use full transcript. Anything more is over-engineering.
Is it long but tolerant of minor forgetting? Use a rolling summary. Accept the lossiness in exchange for simplicity and bounded cost.
Is it long and intolerant of contradiction or re-asking? Use structured state. The reliability and constrainability are worth the engineering.
Does it involve money, commitments, or compliance? Use structured state regardless of length, because the cost of a state error is high.
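The questions above can be sketched as a small function. The names are hypothetical, and the inputs are plain booleans because the article does not fix a threshold for "long". The high-stakes check comes first in the code only because that rule applies regardless of length.

```python
from enum import Enum


class Approach(Enum):
    FULL_TRANSCRIPT = "full transcript"
    ROLLING_SUMMARY = "rolling summary"
    STRUCTURED_STATE = "structured state"


def choose_approach(is_long: bool, tolerates_forgetting: bool, high_stakes: bool) -> Approach:
    """Apply the decision rule; `high_stakes` covers money, commitments, or compliance."""
    if high_stakes:
        return Approach.STRUCTURED_STATE  # applies regardless of length
    if not is_long:
        return Approach.FULL_TRANSCRIPT
    if tolerates_forgetting:
        return Approach.ROLLING_SUMMARY
    return Approach.STRUCTURED_STATE


choice = choose_approach(is_long=True, tolerates_forgetting=True, high_stakes=False)
```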

Hybrids are legitimate

Many production systems combine structured state for critical facts with a short recent transcript for conversational tone. This is not indecision; it is matching each kind of information to the representation that suits it. The case study in Rebuilding a Lapsing-Renewal Bot Around Explicit Turn State arrived at exactly this hybrid.

Combining Approaches Deliberately

The most capable production assistants rarely pick one approach and stop. They layer the three so each kind of information sits in the representation that suits it, which is more disciplined than it sounds.

A layered pattern that works

Structured state holds the facts that must be exact and constrainable — collected slots, decisions made, actions taken.
A short recent transcript preserves conversational tone and the last few exchanges so the model phrases responses naturally.
An optional rolling summary carries long-range soft context for very long conversations, never used for facts that must be precise.
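The three layers above could be assembled roughly as follows. The section labels, the function signature, and the example values are assumptions for illustration. The point is only that each layer is labeled with how far the model should trust it.

```python
from typing import Optional


def render_layered_prompt(
    system: str,
    state: dict[str, str],
    recent_turns: list[str],
    summary: Optional[str] = None,
) -> str:
    """Assemble a prompt from structured state, an optional summary, and recent turns."""
    facts = "\n".join(f"- {name}: {value}" for name, value in state.items())
    sections = [system, "Exact facts (authoritative):\n" + facts]
    if summary:
        # Soft context only: never the source for prices, commitments, or identifiers.
        sections.append("Background summary (may be imprecise):\n" + summary)
    sections.append("Recent turns:\n" + "\n".join(recent_turns))
    return "\n\n".join(sections)


layered = render_layered_prompt(
    system="You are a renewals assistant.",
    state={"plan": "annual", "payment_status": "paid"},
    recent_turns=["user: When does it renew?", "assistant: Let me check."],
    summary="The user has been comparing plans and prefers short answers.",
)
```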

Why layering beats purity

A purely structured approach can sound robotic because it strips conversational texture. A purely transcript-based approach drifts and grows costly. Layering takes the reliability of structured state and the natural feel of a recent transcript, paying for structure only where exactness matters. The renewals account in Rebuilding a Lapsing-Renewal Bot Around Explicit Turn State landed on exactly this layering after starting with transcript alone. The metrics in Reading the Signal: Metrics for Dialogue State in Prompts are what showed the layered version was actually better rather than just different.

Common Mistakes in Choosing

Defaulting to transcript forever

Teams start with full transcript because it is easy, then never revisit it as conversations grow longer and bugs multiply. The transcript approach has a shelf life.

Over-engineering a simple bot

The opposite error: building structured state with reconcile logic for a three-turn helper that the transcript would have handled fine. Match effort to stakes.

Treating a summary as ground truth

The third recurring mistake deserves its own warning: a rolling summary is the model's compressed interpretation, not authoritative state. When teams let summaries stand in for facts that needed to be exact — a price, a commitment, an account number — summarization errors propagate downstream and are nearly impossible to trace back. Keep anything that must be precise in structured state, and let the summary carry only the soft context where minor loss is tolerable.

Frequently Asked Questions

Which approach is the safest default?

For short, low-stakes conversations, full transcript. For anything long or high-stakes, structured state. There is no single default that fits all conversations.

Can I switch approaches later?

Yes, and many teams do — starting with transcript and migrating to structured state as conversations lengthen. Plan for it rather than treating the first choice as permanent.

Why not always use structured state if it is most reliable?

Because it costs the most to build. For simple conversations, that engineering is wasted. Reliability you do not need is not a benefit.

Is a rolling summary risky?

It can be, because compression is lossy and can drop a fact that later matters. Keep critical facts in structured state and use summaries for general context.

How does constrainability factor in?

Only structured state lets you forbid specific actions reliably. If your assistant must never re-ask or contradict, that alone can decide the choice.

What do most production agents end up using, and how do I know my current approach has stopped working?

Most converge on a hybrid: structured state for critical facts plus a short recent transcript for tone. You know your current approach has stopped working when re-asking, contradictions, and looping worsen as conversations lengthen — the classic signature of full transcript stretched too far.
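One rough way to check for that signature is to compare re-ask rates between short and long conversations. The input format (pairs of turn count and re-ask count), the `long_threshold` parameter, and the sample numbers are all assumptions. How re-asks get counted in the first place is outside this sketch.

```python
def reask_rate_by_length(
    conversations: list[tuple[int, int]], long_threshold: int
) -> dict[str, float]:
    """Return re-asks per turn for short and long conversations.

    Each conversation is a (turn_count, reask_count) pair.
    """
    totals = {"short": [0, 0], "long": [0, 0]}  # bucket -> [re-asks, turns]
    for turns, reasks in conversations:
        bucket = "long" if turns >= long_threshold else "short"
        totals[bucket][0] += reasks
        totals[bucket][1] += turns
    return {
        bucket: (reasks / turns if turns else 0.0)
        for bucket, (reasks, turns) in totals.items()
    }


rates = reask_rate_by_length([(4, 0), (6, 0), (20, 3), (30, 5)], long_threshold=12)
```

A long-conversation rate that sits well above the short-conversation rate, as in this made-up sample, suggests the problem scales with length rather than being evenly spread.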

Key Takeaways

Full transcript, rolling summary, and structured state are the three main ways to carry conversation state.
Full transcript is simplest but degrades and grows costly as conversations lengthen.
Rolling summaries bound cost but are lossy and can silently drop important facts.
Structured state is most reliable and constrainable but requires the most engineering.
Decide by conversation length, error tolerance, and stakes — and use structured state whenever money or commitments are involved.
Hybrids that combine structured state with a short recent transcript are a legitimate and common choice.

Agency Script Editorial
