The open-versus-closed decision feels abstract until you put it against a real workload. The same trade-offs that seem evenly balanced in theory tilt decisively one way once you specify the volume, the data sensitivity, and the quality bar. This article walks through concrete use cases and explains precisely what pushed each toward open or closed.
These are composite scenarios drawn from patterns we see across teams, not named companies. The point is the reasoning. Read them and you will start recognizing which pattern your own workload matches.
Use Case 1: A Customer Support Chatbot at a Startup
A seed-stage SaaS company wants a support assistant that answers questions from their docs. Volume is low and unpredictable, the team is three engineers, and there is no sensitive data beyond normal support tickets.
Verdict: Closed API. Per-token cost is trivial at this volume, the team has no infrastructure capacity, and a closed model gets them live in days. Self-hosting here would mean spending weeks building inference infrastructure to save dollars per month. The cost math does not even need a spreadsheet.
What Made It Closed
- Low, bursty volume where pay-per-use is cheapest.
- A tiny team with no operational bandwidth for GPUs.
- No privacy requirement that an enterprise API tier could not meet.
Use Case 2: A Document Processing Pipeline at Scale
A logistics firm processes millions of shipping documents a day, extracting structured fields. The task is well-defined, the volume is enormous and steady, and latency budgets are tight.
Verdict: Self-hosted open-weight. At this volume, per-token API pricing would be a massive recurring bill, and the extraction task does not require frontier reasoning. A fine-tuned open model running on owned GPUs delivers the same accuracy at a fraction of the per-task cost, and the fixed-cost structure is easier to plan around.
This is the textbook case where open earns its keep: high, predictable volume on a routine task. The step-by-step approach shows how to run the cost comparison that makes this obvious.
Use Case 3: A Healthcare Note Summarizer
A hospital system wants to summarize clinical notes. The data is protected health information under strict residency rules, and an audit must show that patient data never left the hospital's controlled environment.
Verdict: Self-hosted open-weight, driven by privacy. Here the deciding factor is not cost at all; it is the architectural requirement that data physically stay in place. A contractual no-retention promise from a closed provider may not satisfy the hospital's auditors, so self-hosting open weights is the only path that clears the bar.
What Made It Open
- A hard data-residency requirement that needs an architectural, not contractual, guarantee.
- Acceptable to invest in infrastructure because compliance is non-negotiable.
- Task difficulty well within open-model capability.
Use Case 4: A Frontier Research Assistant
A small research team needs an assistant for the hardest multi-step reasoning and long-context analysis available, and they need the newest capabilities as soon as they ship. Volume is modest.
Verdict: Closed frontier model. The absolute best reasoning capability still tends to appear first in closed models, and at this team's modest volume the cost is immaterial. Chasing the frontier with self-hosted models would mean constantly trailing the newest releases. When you need the single best model this quarter, closed is where it lives.
Use Case 5: A Product Feature Handling Customer Data
A mid-size company builds an AI feature into their product. Volume is growing but not yet enormous, some customer data is involved, and they want flexibility to negotiate cost and avoid vendor lock-in over time.
Verdict: Hybrid, with routing. They start on a closed API to ship fast, abstract the model behind a thin interface, and move high-volume routine calls to a managed open model as usage grows, while keeping the hardest calls on the closed frontier model. This portfolio approach hedges cost and lock-in without sacrificing speed.
This is the most common mature pattern, and it is why our best practices article recommends defaulting to closed but building for portability from day one.
What These Cases Have in Common
Across every scenario, the deciding factors are the same handful of properties: volume, data sensitivity, task difficulty, and team capacity. Notice that none of the decisions came from ideology. The startup did not pick closed because they distrust open source; they picked it because the math and timeline were obvious.
When you face your own decision, map your workload onto these properties first. If you want a reusable structure for doing that, our framework article turns these patterns into a repeatable model.
Use Case 6: An On-Device Mobile Feature
A consumer app wants an AI feature that runs even with no network connection, like on-device text suggestions or photo descriptions. Privacy is a selling point, and latency must feel instant.
Verdict: Open-weight, distilled and quantized to run on-device. A closed API is impossible here by definition, because there is no network call to make. The team takes an open model, distills it into a small variant, and quantizes it to fit on a phone's hardware. The customization freedom of open weights is the only thing that makes this category possible at all.
What Made It Open
- The feature must run with no network, ruling out any API.
- On-device privacy is a product promise customers can verify.
- Open weights allow distillation and quantization to fit device constraints.
A Failure Worth Studying
Not every story ends well, and the instructive ones are the failures. Consider a team that chose self-hosted open models for a moderate-volume internal tool purely because leadership wanted to "own the stack." Volume never grew large enough to justify the GPUs, the two engineers maintaining the inference setup were pulled off product work, and the tool's quality lagged because nobody had time to keep the model current.
The fix, once they admitted it, was to move the workload back to a closed API. The cost rose slightly on paper but fell dramatically once engineering time was counted, and the freed engineers returned to product. The lesson is blunt: choosing open for ideological reasons rather than workload properties is itself a failure mode, and the same logic that recommends open for high-volume extraction recommends against it here.
Frequently Asked Questions
Does high volume always mean I should self-host an open model?
Not always, but it shifts the math strongly. High and predictable volume on a routine task is the strongest case for self-hosting. If the high-volume task also needs frontier reasoning, or your team cannot run inference reliably, a closed API or managed open hosting may still win.
Can privacy requirements be met without self-hosting?
Sometimes. Closed enterprise tiers with no-retention terms and regional hosting satisfy many compliance regimes. The exception is when you need an architectural guarantee that data physically never leaves your environment, which only self-hosted open weights provide.
Why not just use the best frontier model for everything?
Because at scale it gets expensive, and most tasks do not need frontier capability. Routing routine, high-volume work to cheaper open models and reserving the frontier model for hard tasks usually cuts cost substantially without hurting quality.
How do I know if my task needs a frontier model?
Test it. Run both a frontier closed model and a strong open model against your eval set. If the open model meets your quality bar, you do not need the frontier model for that task. The only reliable way to know is measurement, not assumption.
Key Takeaways
- Low, bursty volume with a small team almost always favors a closed API.
- High, steady volume on routine tasks is the clearest case for self-hosted open-weight models.
- Hard data-residency requirements push toward open self-hosting for architectural guarantees.
- The absolute frontier of capability still tends to appear first in closed models.
- The common mature pattern is a hybrid portfolio decided by workload properties, not ideology.