Ad hoc model decisions do not scale. Every new workload turns into the same debate, and the answer depends on whoever argues hardest that day. A framework fixes this by giving you a fixed set of lenses to look through, so the decision comes from the workload rather than the room.
This article introduces SCALE: five components you evaluate for any workload to reach a defensible open-versus-closed decision. It is deliberately simple, because a framework nobody remembers is a framework nobody uses. Apply all five, weigh them against your workload, and the answer usually becomes obvious.
Why a Framework Beats Case-by-Case Judgment
Case-by-case judgment produces inconsistent decisions, no institutional memory, and endless relitigation. A framework gives you repeatability, a shared vocabulary, and a written rationale that survives staff turnover and the next viral headline.
The goal is not to remove judgment but to channel it. SCALE tells you which factors to weigh; you still decide how they apply to your specific situation. That balance is what makes it usable across very different workloads.
The SCALE Framework
SCALE stands for Sensitivity, Capability, Architecture, Liability, and Economics. You evaluate each lens for the workload in front of you, then synthesize. No single lens decides alone, but some act as hard gates.
S: Sensitivity
How sensitive is the data, and what guarantee does that demand? This is often a hard gate. If a residency rule requires that data physically never leave your environment, only self-hosted open weights qualify, and the other lenses cannot override it. If contractual no-retention guarantees suffice, closed stays in play.
C: Capability
How hard is the task, and does it need the frontier? Assess this honestly. The very best reasoning still tends to appear first in closed models, so frontier-dependent tasks lean closed. Routine extraction, classification, and summarization are well within open-model range, so they do not justify frontier pricing.
A: Architecture
How much control and customization do you need over the model itself? If you need deep fine-tuning, quantization, or distillation into smaller models, open weights give you freedom an API cannot. If the customization available through a provider's API is enough, closed is simpler. Our examples article shows where this lens decided the outcome.
L: Liability
Who absorbs operational and vendor risk, and can your team carry it? Self-hosting open models means you own inference reliability, security patching, and version migration. Closed APIs shift that to the provider but introduce deprecation risk on their schedule. Be honest about your operational maturity; this lens is where teams overestimate themselves.
E: Economics
What is the total cost of ownership at your real volume? Price both paths, including engineering time for the open path, and compare on cost per successful task. Low or bursty volume favors closed; high, steady volume favors open. This lens is quantitative, so let the numbers, not instinct, speak.
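As a rough illustration of comparing both paths on cost per successful task, here is a minimal Python sketch. The `PathCost` fields, the function name, and every dollar figure and success rate in it are hypothetical placeholders rather than measurements; substitute your own cost model and eval results.

```python
from dataclasses import dataclass


@dataclass
class PathCost:
    """Cost inputs for one deployment path. All values used below are hypothetical."""
    fixed_monthly: float     # fixed spend per month, e.g. GPUs plus engineering time
    cost_per_request: float  # marginal cost of a single request
    success_rate: float      # fraction of requests that pass your eval set


def cost_per_successful_task(path: PathCost, monthly_requests: int) -> float:
    """Total monthly cost divided by the number of requests that succeed."""
    if monthly_requests <= 0:
        raise ValueError("monthly_requests must be positive")
    if not 0 < path.success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    total = path.fixed_monthly + path.cost_per_request * monthly_requests
    return total / (monthly_requests * path.success_rate)


# Placeholder numbers chosen only to show the shape of the comparison.
closed_api = PathCost(fixed_monthly=0.0, cost_per_request=0.01, success_rate=0.95)
self_hosted = PathCost(fixed_monthly=8000.0, cost_per_request=0.001, success_rate=0.90)

for volume in (100_000, 5_000_000):
    closed = cost_per_successful_task(closed_api, volume)
    opened = cost_per_successful_task(self_hosted, volume)
    cheaper = "closed" if closed < opened else "open"
    print(f"{volume:>9} requests/month: closed={closed:.4f} open={opened:.4f} -> {cheaper}")
```

With these placeholder inputs, the fixed cost dominates at the lower volume and the per-request cost dominates at the higher one, which is the pattern the Economics lens describes.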
Applying SCALE: From Lenses to a Decision
Evaluate each lens, noting any hard gates first. Sensitivity and Liability frequently act as gates: a residency requirement forces open, while a thin operational team forces closed or managed hosting. If a gate fires, the decision is largely made and the remaining lenses simply confirm or refine it. If both gates fire at once, neither default holds, and the conflict has to be resolved explicitly before you proceed.
When no gate fires, weigh Capability, Architecture, and Economics together. A high-volume routine task with no special customization needs and a healthy ops team points clearly to open. A low-volume frontier-reasoning task points clearly to closed. The framework's value is that two different people running it on the same evidence should reach the same answer.
A Worked Synthesis
- Residency gate fires → self-hosted open, regardless of other lenses.
- No ops capacity → closed or managed hosting, regardless of cost appeal.
- No gates, high volume, routine task → open-weight, validated by Economics.
- No gates, low volume, frontier task → closed, validated by Capability.
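The four rules above can be sketched as a small Python function. This is one possible encoding, not a canonical implementation: the `LensReadings` fields and the returned strings are hypothetical names, and the handling of the case where both gates fire together is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class LensReadings:
    residency_gate: bool    # Sensitivity: data must never leave your environment
    has_ops_capacity: bool  # Liability: the team can run inference in production
    high_volume: bool       # Economics: high, steady volume
    frontier_task: bool     # Capability: the task needs frontier-level reasoning


def synthesize(r: LensReadings) -> str:
    """Check hard gates first, then fall through to the no-gate patterns."""
    # Assumption: when both gates fire, neither default is safe, so this
    # sketch surfaces the conflict instead of picking a side.
    if r.residency_gate and not r.has_ops_capacity:
        return "unresolved: both gates fire"
    if r.residency_gate:
        return "self-hosted open"
    if not r.has_ops_capacity:
        return "closed or managed hosting"
    if r.high_volume and not r.frontier_task:
        return "open-weight"
    if not r.high_volume and r.frontier_task:
        return "closed"
    # Mixed signals: no rule of thumb applies, so the lenses must be weighed.
    return "weigh Capability, Architecture, and Economics"


print(synthesize(LensReadings(False, True, True, False)))
```

Checking the gates before anything else mirrors the order in the text: cost or capability appeal never gets a chance to override a gate.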
SCALE as a Routing Engine, Not Just a One-Time Choice
The framework's real power emerges when you stop applying it once and start applying it per task. Decompose a multi-step workload into its component tasks and run SCALE on each. You will usually find that different tasks land on different sides, which is the foundation of a routed portfolio.
This is how mature teams tend to operate: a closed frontier model for the hardest steps, open-weight models for high-volume routine steps, and routing logic in between. To make that routing durable, pair SCALE with the practices in our best practices article, especially the thin model interface and the living eval set. For the ordered process to gather the evidence each lens needs, see the step-by-step approach.
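To make the routing idea concrete, here is a minimal Python sketch of a router behind a thin model interface. The two backend functions are stand-ins that only tag the prompt, the task names are hypothetical, and defaulting unknown tasks to the closed backend is an assumption; a real version would call your actual provider and inference server.

```python
from typing import Callable, Dict

ModelFn = Callable[[str], str]  # thin interface: prompt in, completion out


def closed_frontier(prompt: str) -> str:
    """Stand-in for a closed API call; a real version would hit the provider."""
    return f"[closed] {prompt}"


def open_weight(prompt: str) -> str:
    """Stand-in for a call to a self-hosted open-weight model."""
    return f"[open] {prompt}"


def make_router(routes: Dict[str, ModelFn], default: ModelFn) -> Callable[[str, str], str]:
    """Return a function that sends each named task to its assigned backend."""
    def run(task: str, prompt: str) -> str:
        backend = routes.get(task, default)
        return backend(prompt)
    return run


# Hypothetical task names; the assignments would come from running SCALE per task.
run = make_router(
    routes={
        "extract_fields": open_weight,
        "classify_ticket": open_weight,
        "draft_legal_analysis": closed_frontier,
    },
    default=closed_frontier,
)

print(run("classify_ticket", "Where is my refund?"))
print(run("draft_legal_analysis", "Summarize the indemnity exposure."))
```

Because callers only see `run(task, prompt)`, moving a task from one side to the other after a fresh SCALE pass is a one-line change to the routing table.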
Where SCALE Fits in a Decision Process
SCALE is a lens framework, not a procedure. It tells you what to weigh, while a separate process tells you how to gather the evidence. The two work together: run the ordered process to collect data (volume figures, an eval set, cost estimates), then view that data through the five SCALE lenses to reach the call.
In practice this means you do not evaluate the lenses from intuition. The Economics lens consumes the cost model you built. The Capability lens consumes your bake-off results. The Sensitivity and Liability lenses consume your documented constraints and an honest read of your team. SCALE without underlying evidence is just opinion with structure; SCALE on top of real measurement is a decision you can defend.
The Two Layers
- Evidence layer: Workload characterization, eval set, cost model, constraint screen.
- Judgment layer: The five SCALE lenses, applied to that evidence, producing the decision.
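One way to picture the split between the two layers is a sketch in which the Capability lens reads bake-off results instead of intuition. The `BakeOff` structure, the function names, and the 0.02 default tolerance are all assumptions made for illustration; pick a tolerance that reflects what a missed task costs you.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BakeOff:
    """Evidence layer: per-example pass/fail results from one shared eval set."""
    open_results: List[bool]
    closed_results: List[bool]


def pass_rate(results: List[bool]) -> float:
    """Fraction of eval examples that passed."""
    if not results:
        raise ValueError("eval set is empty")
    return sum(results) / len(results)


def capability_lens(evidence: BakeOff, tolerance: float = 0.02) -> str:
    """Judgment layer: open is 'in range' if it trails closed by at most `tolerance`."""
    gap = pass_rate(evidence.closed_results) - pass_rate(evidence.open_results)
    return "open is in range" if gap <= tolerance else "leans closed"


# Made-up results for ten eval examples.
evidence = BakeOff(
    open_results=[True] * 9 + [False],
    closed_results=[True] * 10,
)
print(capability_lens(evidence))
```

The lens function holds no data of its own: change the evidence and the judgment follows, which is the relationship between the two layers listed above.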
A Lightweight Version for Fast Decisions
Not every decision warrants the full treatment. For low-stakes or low-volume workloads, run SCALE in its compressed form: ask only whether any hard gate fires. Does Sensitivity demand an architectural guarantee? Does the operational burden flagged by Liability exceed your team's capacity? If neither gate fires and volume is modest, default to a closed API and move on. You can always revisit if the workload grows.
This lightweight pass takes minutes and prevents the framework from becoming bureaucracy. Reserve the full evidence-gathering process for high-stakes decisions: workloads with large volume, regulated data, or significant cost exposure. Matching the depth of analysis to the stakes of the decision is itself part of using SCALE well.
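The compressed pass is small enough to write down as a triage function. This is a sketch under assumptions: the parameter names and the returned labels are hypothetical, and what counts as "modest" volume is left for you to define.

```python
def lightweight_pass(sensitivity_gate: bool, liability_gate: bool, modest_volume: bool) -> str:
    """Triage a workload: settle it in minutes or send it to the full process."""
    if sensitivity_gate or liability_gate:
        # A hard gate fired, so the gate settles the direction.
        return "gate fired: let the gate decide"
    if modest_volume:
        return "default to closed API"
    # No gate, but the volume is large enough to justify real measurement.
    return "run the full evidence process"


print(lightweight_pass(sensitivity_gate=False, liability_gate=False, modest_volume=True))
```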
Frequently Asked Questions
Which SCALE lens should I evaluate first?
Sensitivity and Liability, because they can act as hard gates that settle the decision before the other lenses matter. If a residency requirement or an operational capacity limit fires, you have your answer and the remaining lenses simply confirm it.
Can SCALE give a clear answer when lenses conflict?
Yes, by respecting the gates. Sensitivity and Liability override the others when they fire. When no gate fires, Economics and Capability usually break the tie, since one is quantitative and the other is testable against your eval set. Genuine ties are rare once you measure.
Is SCALE only for large organizations?
No. A solo developer can run it in fifteen minutes. The lenses scale down cleanly: a small team will often find Liability points them firmly to closed or managed hosting, which is a perfectly valid and fast answer.
How does SCALE handle new models being released?
The framework itself does not change. The lenses are about workload properties, not specific models. When a new model ships, only your Capability and Economics evidence needs updating: re-run the bake-off and refresh the cost model. The framework stays stable, which is what makes it reusable over time.
Key Takeaways
- SCALE evaluates Sensitivity, Capability, Architecture, Liability, and Economics for any workload.
- Sensitivity and Liability often act as hard gates that settle the decision early.
- When no gate fires, Economics and Capability usually break the tie with measurable evidence.
- Apply SCALE per task to decompose workloads and build a routed portfolio.
- The framework is durable because it weighs workload properties, not specific models.