2026-05-17
Build with the LLM. Don’t put the LLM in the thing. Related: Deterministic Layer Cakes, Path-Dependent Memory.
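There are two ways to use an LLM in a product:
1. Build with it: the LLM helps you write deterministic code, and that code is what ships.
2. Put it in the thing: the LLM sits in the hot path and answers every request at runtime.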
Most of what ships as “AI-powered” is option 2 applied to problems that need option 1. Input classification that should be a lookup table. Template selection that should be a rule engine. Data extraction that should be a parser. FAQ responses that should be a search index.
This isn’t innovation. It’s outsourcing thinking to an API because building a rule engine requires understanding the domain. The demo ships in an afternoon, works 85% of the time, and locks you into per-token pricing forever for something that should have been a config file.
The sane architecture: a system that starts with the LLM in the hot path and progressively removes it — crystallising patterns into deterministic code until the LLM only handles what’s genuinely novel.
You don’t need a “confidence-scoring router layer.” The code’s own failure paths are the router.
try the deterministic thing
↓ worked? → done, cheap, fast
↓ didn't work? → fall back to LLM, log WHY it fell through
The catch, the else, the regex non-match, the lookup miss — these aren’t errors. They’re discovery signals. The system telling you “here’s a shape I don’t have a rule for yet.”
Every deterministic handler naturally has escape hatches:
else in a case statement → unrecognised category
These escapes route to the LLM. The LLM handles it, the result gets logged. No special routing logic needed: the code structure is the router.
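A minimal Python sketch of this shape. Everything named here is an assumption for illustration: the `classify` function, the keyword table, the order-number regex, and the `llm` callable passed in by the caller. None of it comes from a real library.

```python
import re
from typing import Callable

# Hypothetical deterministic rules; stand-ins for real domain knowledge.
CATEGORY_BY_KEYWORD = {"refund": "billing", "invoice": "billing", "password": "account"}
ORDER_ID = re.compile(r"\border\s+#?(\d+)\b", re.IGNORECASE)

# In a real system this would be durable storage, not a module-level list.
fallback_log: list[dict] = []


def classify(text: str, llm: Callable[[str], str]) -> str:
    """Try the deterministic paths first; fall through to the LLM on a miss."""
    if ORDER_ID.search(text):
        return "order_status"
    lowered = text.lower()
    for keyword, category in CATEGORY_BY_KEYWORD.items():
        if keyword in lowered:
            return category
    # Regex non-match and lookup miss: not an error, a discovery signal.
    answer = llm(text)
    fallback_log.append(
        {"site": "input_classifier", "reason": "no match", "input": text, "answer": answer}
    )
    return answer
```

The only "routing" is the order of the checks. The LLM call sits where an `else` branch or a raised exception would otherwise be, and the log entry records where and why the input fell through.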
The controller watches the fallback log:
"fell through at: input_classifier, reason: no match" — seen 3 times
"fell through at: input_classifier, reason: no match" — seen 47 times
"fell through at: input_classifier, reason: no match" — seen 200 times
↓
threshold hit
↓
ask LLM: "here are 200 inputs that fell through at this point,
and here's what you answered each time.
write me the code that handles them."
↓
validate against historical pairs → promote → deterministic path extended
The system grows at its failure points. Each promoted handler has its own try/catch/else, which accumulates its own fallbacks, which eventually crystallise into the next extension. The tree grows at its leaves.
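The controller loop above could be sketched roughly as follows. The `FallbackMonitor` class, its method names, and the prompt wording are hypothetical; the threshold of 200 simply mirrors the diagram and would need tuning per fallback site.

```python
from collections import Counter, defaultdict

PROMOTION_THRESHOLD = 200  # mirrors the diagram; an assumption, not a recommendation


class FallbackMonitor:
    """Counts fall-throughs per (site, reason) and keeps the input/answer pairs."""

    def __init__(self, threshold: int = PROMOTION_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()
        self.examples = defaultdict(list)

    def record(self, site: str, reason: str, input_text: str, llm_answer: str) -> bool:
        """Log one fall-through; return True when this bucket has just hit the threshold."""
        key = (site, reason)
        self.counts[key] += 1
        self.examples[key].append((input_text, llm_answer))
        return self.counts[key] == self.threshold

    def build_prompt(self, site: str, reason: str) -> str:
        """Assemble the batch request for the LLM from the logged pairs."""
        pairs = self.examples[(site, reason)]
        lines = [
            f"Here are {len(pairs)} inputs that fell through at {site} ({reason}),",
            "and what you answered each time. Write the code that handles them.",
            "",
        ]
        lines += [f"{inp!r} -> {ans!r}" for inp, ans in pairs]
        return "\n".join(lines)
```

`record` returns `True` exactly once per bucket, so the batch job fires once when the threshold is crossed rather than on every later fall-through. The code the LLM returns still has to pass validation against the stored pairs before it is promoted.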
Strip away the language and what you have is:
noise → filter → signal → pattern detect → codify → filter gets better
The fallback bucket is unprocessed signal — stuff that hasn’t found a home yet. The pattern detection threshold is: “enough similar shapes in the fallback bucket that it’s worth building a filter for them.”
This is what DSP has done forever. Matched filters, adaptive thresholds, training sequences. The LLM’s role is: the thing that can look at a cluster of unstructured signals and write the matched filter for them. It’s the engineer, not the system.
The actual runtime:
| Component | Role | Cost |
|---|---|---|
| Filter bank | Deterministic handlers, fast path | Negligible |
| Unmatched buffer | Accumulating fallbacks, monitored | Storage only |
| Threshold trigger | “Enough similar failures” detector | Statistical, simple |
| Filter generator | LLM, invoked rarely, offline/batch | Per-token, but infrequent |
The LLM is barely in the runtime picture. It’s a batch job that fires when the unmatched buffer says “I’ve got something for you.” This is not an AI product. It’s a signal processing system with an automated engineer on retainer.
You don’t have to start cold. If you have historical flows, decisions, outcomes — the crystallisation can be pre-seeded before the system goes live.
1. Dump historical data
Past decisions, inputs, outputs, corrections, things that were ignored.
2. LLM-assisted triage (one-time batch cost)
3. Signal vs noise separation
If 60% of historical inputs were noise that led to “discard/no action” — that’s your first rule. Don’t even route to the LLM. Pattern: looks like X → discard. Massive cost savings from day one.
4. Decision tree extraction
For signal clusters: “Given these 200 examples that all resulted in outcome Y, what’s the minimum decision logic?” The LLM drafts a tree. You validate against the historical pairs.
5. Stability scoring
Historical data reveals:
6. Edge case catalog
Historical corrections, escalations, exceptions → don’t crystallise. Tag as “LLM territory” with few-shot examples ready.
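Steps 3 and 4 can be sketched in a few lines of Python. The noise patterns below are invented examples of what a triage pass might surface, and `validate` is a hypothetical helper; the default of requiring every historical pair to pass is an assumption you may want to relax.

```python
import re

# Hypothetical noise patterns; real ones come out of your own historical data.
NOISE_PATTERNS = [
    re.compile(r"^\s*$"),                          # empty or whitespace-only input
    re.compile(r"out of office", re.IGNORECASE),   # auto-replies
    re.compile(r"^unsubscribe\b", re.IGNORECASE),  # list-management commands
]


def is_noise(text: str) -> bool:
    """Step 3: looks like X -> discard, without ever reaching the LLM."""
    return any(p.search(text) for p in NOISE_PATTERNS)


def validate(candidate, pairs, min_accuracy: float = 1.0):
    """Step 4: replay historical (input, outcome) pairs through a drafted rule.

    Returns (passed, failures); each failure is (input, expected, got_or_error).
    """
    failures = []
    for inp, expected in pairs:
        try:
            got = candidate(inp)
        except Exception as exc:  # a crashing rule counts as a failed case
            failures.append((inp, expected, repr(exc)))
            continue
        if got != expected:
            failures.append((inp, expected, got))
    accuracy = 1 - len(failures) / len(pairs) if pairs else 0.0
    return accuracy >= min_accuracy, failures
```

A drafted rule with no historical pairs to check against never passes, which keeps untested code out of the deterministic path.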
Boot sequence result:
Historical data
↓
LLM batch analysis (one-time cost)
↓
├── Noise filters → discard rules, live immediately
├── Stable patterns → deterministic handlers, live with monitoring
├── Volatile patterns → LLM path, observe for stabilisation
└── Known edges → LLM path, few-shot primed
↓
System goes live with 60-70% already crystallised
↓
Organic discovery handles the rest
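One way to sort historical clusters into the four branches above, as a sketch only. The stability measure used here (share of the most common outcome in a cluster) and the 0.95 cut-off are assumptions made for illustration, not the scoring method of the original; the function names are hypothetical too.

```python
from collections import Counter


def stability(outcomes) -> float:
    """Share of the most common outcome in a cluster, between 0 and 1."""
    if not outcomes:
        return 0.0
    return Counter(outcomes).most_common(1)[0][1] / len(outcomes)


def assign_bucket(outcomes, had_corrections: bool = False, stable_at: float = 0.95) -> str:
    """Map one cluster of historical outcomes to a branch of the boot sequence."""
    if had_corrections:
        return "known_edge"    # LLM path, few-shot primed
    if stability(outcomes) < stable_at:
        return "volatile"      # LLM path, observe for stabilisation
    top_outcome = Counter(outcomes).most_common(1)[0][0]
    if top_outcome == "discard":
        return "noise_filter"  # discard rule, live immediately
    return "stable"            # deterministic handler, live with monitoring
```

Clusters with any history of corrections or escalations are kept on the LLM path regardless of how consistent their outcomes look.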
Crystallised code doesn’t care who wrote it. It’s just code. So you can:
No model lock-in. The rules are the artifact, not the model. This aligns with Path-Dependent Memory’s position: persist deliverables, not opinions. A crystallised rule is testable against concrete input/output pairs — it either handles the case correctly or it doesn’t, regardless of which model generated it.
The one caveat: if you swap models and the new model draws the fallback boundary differently (what counts as “matched” vs “fell through”), you get novel inputs matching stale rules. Mitigation: continuous validation of promoted rules against live traffic. Drift detection demotes rules that stop working.
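A rough sketch of that mitigation, assuming some fraction of live traffic is spot-checked against a reference answer (a sampled LLM call or a human review). The `PromotedRule` class, the window size, and the agreement threshold are all hypothetical.

```python
from collections import deque


class PromotedRule:
    """Wraps a crystallised handler and demotes it when spot checks stop agreeing."""

    def __init__(self, handler, window: int = 100, min_agreement: float = 0.9):
        self.handler = handler
        self.results = deque(maxlen=window)  # rolling record of agree/disagree
        self.min_agreement = min_agreement
        self.active = True

    def spot_check(self, text: str, reference_answer: str) -> bool:
        """Compare the rule with a reference answer; return whether it is still active."""
        self.results.append(self.handler(text) == reference_answer)
        window_full = len(self.results) == self.results.maxlen
        if window_full and sum(self.results) / len(self.results) < self.min_agreement:
            self.active = False  # demoted: route this shape back to the LLM path
        return self.active
```

Waiting for a full window before demoting avoids throwing away a rule because of one or two early disagreements. A demoted rule's inputs simply start landing in the fallback log again.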
The subsidy problem: most “AI features” are priced below cost to capture market share. When real pricing hits, every product with an LLM in the hot path either eats the margin or passes it on. The cost cascades down to the consumer — an invisible tax embedded in the infrastructure layer.
A self-crystallising system inverts this:
Cost trends down over time instead of scaling linearly with usage. The LLM earns its keep only on the frontier. Everything behind the frontier is solved and runs for negligible cost.
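The arithmetic behind that claim, as a small sketch. The request volume, per-call price, and fallback rates below are made-up illustrative numbers, and `llm_spend` is a hypothetical helper.

```python
def llm_spend(requests: int, fallback_rate: float, cost_per_call: float) -> float:
    """LLM spend for a period: only requests that fall through reach the model."""
    return requests * fallback_rate * cost_per_call


# Illustrative values only; substitute your own traffic and pricing.
REQUESTS = 1_000_000
COST_PER_CALL = 0.002

# A fallback rate of 1.0 is the LLM-in-the-hot-path case; lower rates model
# progressively more traffic handled by crystallised code.
for rate in (1.0, 0.35, 0.10, 0.02):
    print(f"fallback rate {rate:.0%}: {llm_spend(REQUESTS, rate, COST_PER_CALL):,.2f}")
```

With the LLM in the hot path the fallback rate is fixed at 1.0, so spend scales with `requests` alone. In the crystallising design the rate is the variable that shrinks, which is why spend can fall even as usage grows.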
The labs won’t sell you this. Their business model is the meter running. A system designed to need itself less over time is antithetical to per-token pricing. You’d have to build it yourself.
The pattern mirrors how you should build any product:
The self-crystallising system is this loop automated. The LLM plays the role of “the founder doing things manually at first” — handling everything, observing what repeats, then writing the automation that replaces itself.
A signal processing system where:
The LLM’s role: automated engineer on retainer, not runtime dependency. Called rarely, for batch work, to extend the system. Not sitting in the hot path burning tokens on things that should be an if statement.
For why accumulated LLM-authored memory doesn’t transfer across model versions (but crystallised code does), see Path-Dependent Memory.
For the argument that nodes are unreliable and trust belongs in the plumbing, see Deterministic Layer Cakes.