2026-05-14
The case against ACE-style self-managed memory for systems that need model-independence.
Architectures like ACE (Autonomous Cognitive Entity) propose agents that manage their own memory stores — episodic, semantic, procedural — accumulating institutional knowledge over time, analogous to a knowledge worker warming up to an organization.
The appeal: an agent that gets better with tenure. Knows which sources are reliable, which approaches fail, how the org’s systems actually behave (not how the docs say they behave).
When a model writes a memory, it writes for itself:
Swap the model, and the memories become someone else’s diary — legible but miscalibrated.
The deeper issue: memory isn’t just “what you saw.” It’s the output of a path through experience-space where the model’s intermediate thoughts are inputs to the next step.
Model sees raw data
→ Thinks "X is interesting" (model-specific salience)
→ Explores X (generates new experiences that only exist because of this choice)
→ Finds pattern P (only findable via the X path)
→ Writes memory about P
→ Future behavior shaped by P
→ Encounters new data through P-shaped lens
→ ...
A different model, given identical initial inputs:
Same org, same data, different model = different knowledge. Not slightly different — divergently different, from step 1.
This path dependence isn’t only cross-model. The same model, given identical inputs, diverges across runs due to sampling stochasticity — a different token chosen at step 3 changes what’s “interesting” at step 4, which changes which data is generated at step 5. Model differences amplify this, but randomness alone is sufficient. Any system where intermediate outputs feed back as inputs to the same process is sensitive to small perturbations in time — and LLM inference is inherently non-deterministic. The combination of self-referential thought (each agent’s outputs becoming its own future inputs) and temporal stochasticity (nudges that differ across runs) means that divergent outcomes are the default, not the exception.
The attractive middle ground: “store raw experiences, replay through the new model on upgrade.”
This fails because:
You cannot separate “the inputs” from “the model’s reaction to inputs” because reactions become inputs to subsequent steps. It’s a recurrence relation with the model on both sides.
In practice, long-lived agents experience model updates over time:
Month 1: Model A writes memory layer (salience pattern A, framing A)
Month 3: Model B reads layer A, writes layer B (different framing)
Month 6: Model C reads layers A+B, writes layer C
Each layer’s salience decisions were made by a different cognitive process. The memory store becomes a geological record of different minds’ opinions. Worse — layers interact:
Confirmation cascade through time, with no person in the loop to ask “who decided this, and were they right?”
Human institutional knowledge benefits from continuity of consciousness:
An LLM has none of this. Each call is a stranger reading a dead person’s notes. The “dead person” was a previous model version — computationally unrelated to the current one.
Pick a model version. Never update. Memory stays coherent. Trade capability growth for knowledge continuity.
Equivalent to: “this employee can never be replaced, and never learns new skills.”
Don’t accumulate model-subjective knowledge. Every task starts fresh. Store artifacts (raw outputs, structured data, tool results) not understanding. Let whatever model is current form its own interpretations each time.
Cost: redundant re-derivation. Benefit: model-swap is free, never wrong due to stale memory.
Only persist structured, verifiable, non-interpretive data:
Problem: the interpretive layer IS the valuable part. Facts without interpretation is just a database.
When upgrading, run the new model alongside the old. The new model builds fresh knowledge with access to the old model’s raw artifacts (not memories). Retire the old model when the new one’s contextual knowledge surpasses it.
Equivalent to: onboarding a new hire while the departing employee is still available.
Factosis chooses option 2: agents are temps, not employees.
Each investigation is a fresh engagement by an extremely capable temp with perfect access to structured artifacts. The temp forms brilliant working interpretations for the duration of the task, leaves behind a complete audit trail, and departs. The next investigation starts fresh with access to prior artifacts but no prior opinions.
Knowledge persists as:
abandoned_angles field (tiny, overwritten each loop, prevents re-investigation of dead ends within a single investigation)Knowledge does NOT persist as:
This is a deliberate tradeoff: redundant re-derivation in exchange for model-independence and zero stale-memory failure modes.
If Factosis ever needs cross-investigation knowledge (e.g., “last time we saw this pattern, the root cause was X”), the mechanism should be:
The retrieval mechanism would surface prior raw data, not prior conclusions.
For speculative exploration of what would work if Factosis ever needed cross-investigation memory, see Consensus-Based Agent Memory — agreement gates, corporate dysfunction pathologies, and the devil’s advocate problem.
For a mitigation of within-investigation path dependence, see Convergence Testing — run N temps with the same brief and compare outcomes.