2026-05-12
Speculative design exploration — not planned for implementation. For the argument that motivates this (why self-authored memory fails), see Path-Dependent Memory. For the related but distinct question of whether a single investigation’s conclusion is reliable, see Convergence Testing.
Human institutional knowledge has the same path-dependence problem — different people notice different things, walk different paths, form different interpretations. Yet org knowledge is robust to employee turnover. Why?
Because consensus, not individual memory, is what persists.
Org knowledge isn’t the union of everyone’s memories. It’s the negotiated overlap — what survived collision between differently-minded people:
Individual memory is a cache. The consensus layer is the durable store.
Negotiation strips private framing — if two minds disagree on how to frame something, they find language that works for both. That shared language is more likely to work for a third.
Verification through disagreement — one mind’s over-reaction gets challenged. “Service X is unreliable” only becomes org knowledge if a second mind independently confirms it.
Multiple paths confirming one conclusion — if mind A reached conclusion P via path X and mind B reached P via path Y, P is more likely to be a property of the territory than an artifact of one path through it.
Shared language over private notation — consensus produces artifacts in common terms, not any individual’s idiosyncratic shorthand.
Model A investigates → forms interpretation P
Model B investigates same raw data → forms interpretation Q
↓
Agreement gate: what do P and Q share?
↓
Consensus artifact: the overlap
(validated by ≥2 cognitive processes)
↓
THIS is what you persist.
Not A's memory. Not B's memory. The agreement.
Factosis already does this in miniature — the completion audit (Opus validating Sonnet’s conclusions). But it’s one-directional and terminal. A full agreement-based memory would run the gate throughout:
┌─────────────────────────────────────────────────┐
│ INVESTIGATION (Model A) │
│ Forms interpretations, writes raw findings │
└──────────────────────┬──────────────────────────┘
│ candidate knowledge
▼
┌─────────────────────────────────────────────────┐
│ AGREEMENT GATE (Model B — adversarial) │
│ Re-derives from same raw data independently. │
│ │
│ AGREE → consensus artifact (persist) │
│ DISAGREE → individual opinion (discard) │
│ PARTIAL → persist only the agreed subset │
└──────────────────────┬──────────────────────────┘
│ validated
▼
┌─────────────────────────────────────────────────┐
│ CONSENSUS STORE │
│ - Model-agnostic (verified by ≥2 processes) │
│ - Shared language (not private notation) │
│ - Re-verifiable (raw data still accessible) │
│ - Expirable (re-run gate periodically) │
└─────────────────────────────────────────────────┘
Agreement-based persistence inherits the failure modes of the human orgs it mimics:
The uncomfortable conclusion: follow this thread far enough and you’ve rebuilt corporate politics — performance reviews, culture fit, normie consensus, and the systematic firing of mavericks who were right but outnumbered.
This is strictly better than unchallenged self-authored memory (one model’s hallucination can’t compound unchecked). But it is not a solution to the fundamental problem. It trades individual hallucination for collective blind spots. The failure mode is quieter, slower, and harder to detect — which may make it worse.
The pathologies above suggest a fix: keep a dedicated “divergent thinker” model whose job is to disagree. This is not a new idea — org theory has studied it for decades under various names:
The core finding across all of these: constructive dissent is disproportionately valuable but systematically undervalued, because:
For agents this maps exactly:
Option A: Keep a "red team" model that challenges consensus
Cost: extra model call on every candidate memory
Noise: mostly disagrees pointlessly
Value: occasionally catches the blind spot that would have compounded silently
Measurable? No. Not until the counterfactual materialises.
Option B: Don't keep it
Cost: zero
Risk: collective blind spots ossify into unchallengeable canon
Measurable? Also no. You never see the catastrophe you didn't prevent.
The valuation problem is identical to the human org version: you cannot measure the value of dissent before it’s proven right. Any metric that tracks “agreement rate” or “consensus contribution” will systematically eliminate the most valuable divergent signals.
This is the unsolved problem at the bottom of agreement-based persistence. Every org that has solved it (red teams, skunkworks, 10th man) solved it by making dissent a structural role, not an emergent behaviour — and by explicitly protecting that role from the consensus mechanism’s natural tendency to eliminate it.
The agent equivalent would be: a model that is prompted to disagree, whose disagreements are preserved regardless of whether the consensus gate accepts them, and whose track record is evaluated only in retrospect against ground truth. Expensive. Mostly noise. Occasionally the only thing between you and a silent failure cascade.
The current architecture already has the primitives:
A future cross-investigation memory layer could extend this: persist only conclusions that a second model independently derives from the same raw artifacts. Not “what Sonnet thinks” — “what Sonnet and Opus both conclude from the same evidence.”
This is strictly better than self-authored memory, but strictly more expensive. The tradeoff becomes: cost of re-derivation vs. cost of agreement protocol. For high-value, long-lived org context, the agreement cost may be justified.