Path-Dependent Memory

2026-05-14

Path-Dependent Memory: Why Long-Lived Agent Memory Doesn’t Transfer

The case against ACE-style self-managed memory for systems that need model-independence.

The Premise

Architectures like ACE (Autonomous Cognitive Entity) propose agents that manage their own memory stores — episodic, semantic, procedural — accumulating institutional knowledge over time, analogous to a knowledge worker warming up to an organization.

The appeal: an agent that gets better with tenure. Knows which sources are reliable, which approaches fail, how the org’s systems actually behave (not how the docs say they behave).

The Problem: Memory Is Model-Subjective

When a model writes a memory, it writes for itself:

  • What it finds salient (another model would notice different things)
  • How it frames the experience (compression choices are model-specific)
  • What it omits as “obvious” (obvious-to-whom varies by model)
  • Retrieval cues optimized for its own attention patterns

Swap the model, and the memories become someone else’s diary — legible but miscalibrated.

Path Dependence in Cognition

The deeper issue: memory isn’t just “what you saw.” It’s the output of a path through experience-space where the model’s intermediate thoughts are inputs to the next step.

Model sees raw data
  → Thinks "X is interesting" (model-specific salience)
  → Explores X (generates new experiences that only exist because of this choice)
  → Finds pattern P (only findable via the X path)
  → Writes memory about P
  → Future behavior shaped by P
  → Encounters new data through P-shaped lens
  → ...

A different model, given identical initial inputs:

  • Finds Y interesting instead of X
  • Explores Y, generating different follow-up experiences
  • Finds pattern Q (unreachable from the X path)
  • Builds entirely different institutional knowledge

Same org, same data, different model = different knowledge. Not slightly different: the two paths diverge from step 1.

This path dependence isn’t only cross-model. The same model, given identical inputs, diverges across runs because of sampling stochasticity: a different token chosen at step 3 changes what’s “interesting” at step 4, which changes which data is generated at step 5. Model differences amplify this, but randomness alone is sufficient. Any system where intermediate outputs feed back as inputs to the same process is sensitive to small early perturbations, and LLM inference is non-deterministic in practice whenever tokens are sampled at nonzero temperature. The combination of self-referential thought (each agent’s outputs becoming its own future inputs) and temporal stochasticity (nudges that differ across runs) means that divergent outcomes are the default, not the exception.
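The feedback loop described above can be illustrated with a toy model. This is a minimal sketch, not a model of any real agent: the topics "X" and "Y", the salience weights, and the `nudge_step` parameter (a stand-in for one differently sampled token) are all assumptions made for illustration.

```python
def run_agent(nudge_step=None, steps=8):
    """Toy agent whose past choices feed back into what it finds salient next."""
    salience = {"X": 1.0, "Y": 1.0}  # hypothetical topics, equally salient at the start
    path = []
    for step in range(steps):
        # Pick the most salient topic; max() returns the first key on a tie, so "X".
        choice = max(salience, key=salience.get)
        if step == nudge_step:
            # Stand-in for one differently sampled token: flip this single choice.
            choice = "Y" if choice == "X" else "X"
        salience[choice] += 1.0  # exploring a topic makes it more salient later
        path.append(choice)
    return path


def written_memory(path):
    """The 'memory' is whichever topic dominated the path."""
    return max(sorted(set(path)), key=path.count)


baseline = run_agent()            # no perturbation
early = run_agent(nudge_step=0)   # one flipped choice at the first step
late = run_agent(nudge_step=3)    # one flipped choice after the path has formed
print(written_memory(baseline), written_memory(early), written_memory(late))
# prints: X Y X
```

A single flipped choice at step 0 produces an entirely different path and a different memory. In this toy the late flip is absorbed because reinforcement has already locked the path in; nothing guarantees that a real agent is as forgiving.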

Why Replay Fails

The attractive middle ground: “store raw experiences, replay through the new model on upgrade.”

This fails because:

  1. Intermediate thoughts diverge — the new model’s reasoning during replay produces different “what’s interesting” signals
  2. Actions would have differed — different salience → different queries → different data generated → experiences that don’t exist in the replay log
  3. The replay log is path-shaped — it’s a choose-your-own-adventure playthrough by a different mind. The new model would never have walked this path.
  4. Compounding divergence — small differences in early reasoning compound into completely different knowledge structures

You cannot separate “the inputs” from “the model’s reaction to inputs” because reactions become inputs to subsequent steps. It’s a recurrence relation with the model on both sides.
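One way to see why the replay log is path-shaped is a small sketch in which each “model” is reduced to a lookup from observation to next query. Everything here is hypothetical: the `WORLD` contents, the two policies, and the function names are illustrative assumptions, not part of any real system.

```python
# Hypothetical environment: what each query would return if actually run.
WORLD = {
    "start": "errors in service X and latency in service Y",
    "inspect X": "X times out under load",
    "inspect Y": "Y has a slow database query",
    "load-test X": "X fails above 100 rps",
    "profile Y": "Y is missing an index",
}

# Each "model" is a policy: given the last observation, which query comes next.
POLICY_A = {WORLD["start"]: "inspect X", WORLD["inspect X"]: "load-test X"}
POLICY_B = {WORLD["start"]: "inspect Y", WORLD["inspect Y"]: "profile Y"}


def live_run(policy, steps=2):
    """Run against the world and log only the experiences this policy generated."""
    log, obs = {}, WORLD["start"]
    for _ in range(steps):
        query = policy[obs]
        obs = WORLD[query]
        log[query] = obs
    return log


def replay(policy, log, steps=2):
    """Replay a log through a policy; return the queries the log cannot answer."""
    missing, obs = [], WORLD["start"]
    for _ in range(steps):
        query = policy.get(obs)
        if query is None:
            break
        if query not in log:
            missing.append(query)  # this experience was never generated
            break
        obs = log[query]
    return missing


log_a = live_run(POLICY_A)
print(replay(POLICY_A, log_a))  # prints: []
print(replay(POLICY_B, log_a))  # prints: ['inspect Y']
```

The original policy replays cleanly. The second policy leaves the log at its first decision, because the observation it wants was never recorded.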

The Layer Cake Problem

In practice, long-lived agents experience model updates over time:

Month 1:  Model A writes memory layer (salience pattern A, framing A)
Month 3:  Model B reads layer A, writes layer B (different framing)
Month 6:  Model C reads layers A+B, writes layer C

Each layer’s salience decisions were made by a different cognitive process. The memory store becomes a geological record of different minds’ opinions. Worse — layers interact:

  • Model A over-reacts to a timeout, writes “Service X unreliable”
  • Model B retrieves that memory, avoids Service X, writes “used Y as workaround”
  • Model C sees two memories reinforcing “X is unreliable” — never re-verifies the original judgment

Confirmation cascade through time, with no person in the loop to ask “who decided this, and were they right?”
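The cascade can be made concrete by separating how much support a claim appears to have from how much independent evidence sits behind it. This is a rough sketch under stated assumptions: the `Memory` record, its `supports`, `raw_observation` and `derived_from` fields, and both counting functions are hypothetical, invented here for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Memory:
    author: str                            # which model wrote the entry
    text: str                              # the prose the model stored
    supports: str                          # the claim this entry appears to back
    raw_observation: Optional[str] = None  # direct evidence, if any
    derived_from: List[int] = field(default_factory=list)  # store indexes that prompted it


store = [
    Memory("model-A", "Service X unreliable", "X is unreliable",
           raw_observation="one timeout calling Service X"),
    Memory("model-B", "used Y as workaround", "X is unreliable",
           derived_from=[0]),
]


def apparent_support(memories, claim):
    """What a later model sees: every entry that seems to back the claim."""
    return sum(1 for m in memories if m.supports == claim)


def independent_support(memories, claim):
    """Entries backed by their own observation rather than by an earlier memory."""
    return sum(
        1 for m in memories
        if m.supports == claim and m.raw_observation is not None and not m.derived_from
    )


claim = "X is unreliable"
print(apparent_support(store, claim), independent_support(store, claim))
# prints: 2 1
```

Model C sees two entries, but only one timeout was ever observed. Without provenance of this kind, the two counts are indistinguishable.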

Why Human Institutional Knowledge Works (And Agent Memory Doesn’t)

Human institutional knowledge benefits from continuity of consciousness:

  • You partially remember why you wrote a note
  • You can calibrate against your own past judgment (“I over-react to outages”)
  • You re-derive from the same raw experience when memory feels stale

An LLM has none of this. Each call is a stranger reading a dead person’s notes. The “dead person” was a previous model version — computationally unrelated to the current one.

The Options

1. Accept model-lock

Pick a model version. Never update. Memory stays coherent. Trade capability growth for knowledge continuity.

Equivalent to: “this employee can never be replaced, and never learns new skills.”

2. Accept ephemerality

Don’t accumulate model-subjective knowledge. Every task starts fresh. Store artifacts (raw outputs, structured data, tool results), not understanding. Let whatever model is current form its own interpretations each time.

Cost: redundant re-derivation. Benefit: model swaps are free, and no task goes wrong because of stale memory.

3. Restrict to model-agnostic facts

Only persist structured, verifiable, non-interpretive data:

  • “API returned 403 on 2026-04-12” (fact)
  • NOT “API X is unreliable” (interpretation)

Problem: the interpretive layer IS the valuable part. Facts without interpretation are just a database.
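As a sketch of what “structured, verifiable, non-interpretive” could mean in practice, the store below accepts only typed observations and rejects free-form prose. The `HttpObservation` type, its field names, and the `persist` function are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass(frozen=True)
class HttpObservation:
    """A checkable fact: what was called, what came back, and when."""
    endpoint: str
    status: int
    observed_on: date


def persist(store, record):
    """Append a record only if it is a structured observation."""
    if not isinstance(record, HttpObservation):
        raise TypeError("only structured observations are persisted")
    store.append(asdict(record))


store = []
persist(store, HttpObservation("API X", 403, date(2026, 4, 12)))  # fact: accepted

try:
    persist(store, "API X is unreliable")  # interpretation: rejected
    rejected = False
except TypeError:
    rejected = True

print(len(store), rejected)  # prints: 1 True
```

The type check is crude: it keeps prose out of the store, but it cannot decide which facts were worth recording, and that choice is still made by a model.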

4. Overlap model tenures

When upgrading, run the new model alongside the old. The new model builds fresh knowledge with access to the old model’s raw artifacts (not memories). Retire the old model when the new one’s contextual knowledge surpasses it.

Equivalent to: onboarding a new hire while the departing employee is still available.

Factosis’s Position

Factosis chooses option 2: agents are temps, not employees.

Each investigation is a fresh engagement by an extremely capable temp with perfect access to structured artifacts. The temp forms brilliant working interpretations for the duration of the task, leaves behind a complete audit trail, and departs. The next investigation starts fresh with access to prior artifacts but no prior opinions.

Knowledge persists as:

  • Structured state: hypothesis store, source registry (model-agnostic, controller-validated)
  • Raw artifacts: tool outputs, data files, git history (reinterpretable by any model)
  • Anti-loop memory: abandoned_angles field (tiny, overwritten each loop, prevents re-investigation of dead ends within a single investigation)

Knowledge does NOT persist as:

  • Model-subjective interpretations
  • Accumulated “experience”
  • Cross-investigation memory

This is a deliberate tradeoff: redundant re-derivation in exchange for model-independence and zero stale-memory failure modes.
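The persistence rules above might look roughly like this in code. This is a guess at a shape, not the actual Factosis implementation: only the `abandoned_angles` field name comes from the text, while `InvestigationState`, `hypotheses`, `sources`, `end_of_loop`, and `should_skip` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InvestigationState:
    hypotheses: Dict[str, str] = field(default_factory=dict)   # hypothesis -> status
    sources: Dict[str, str] = field(default_factory=dict)      # source -> location
    abandoned_angles: List[str] = field(default_factory=list)  # anti-loop memory


def end_of_loop(state, dead_ends):
    """Overwrite (not append to) the anti-loop field with the current dead ends."""
    state.abandoned_angles = list(dead_ends)


def should_skip(state, angle):
    """True if this angle was already abandoned within the same investigation."""
    return angle in state.abandoned_angles


state = InvestigationState()
end_of_loop(state, ["dns"])
end_of_loop(state, ["dns", "cache"])
print(should_skip(state, "cache"))  # prints: True

# The next investigation starts from a fresh state: no carried-over opinions.
next_state = InvestigationState()
print(should_skip(next_state, "cache"))  # prints: False
```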

Implication for Cross-Investigation Learning

If Factosis ever needs cross-investigation knowledge (e.g., “last time we saw this pattern, the root cause was X”), the mechanism should be:

  • Store raw findings and artifacts from prior investigations (already done — git repos)
  • Let the current model re-interpret prior artifacts when relevant
  • Never store “lessons learned” as model-authored prose that future models inherit uncritically

The retrieval mechanism would surface prior raw data, not prior conclusions.
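A retrieval step along these lines could filter on provenance before matching on content. The sketch below assumes a hypothetical `Artifact` record with a `model_authored` flag; the investigation names, file paths, and contents are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    investigation: str
    path: str
    content: str
    model_authored: bool  # True for summaries or conclusions written by a model


def retrieve_raw(artifacts, keyword):
    """Return matching raw artifacts only; model-authored prose is never surfaced."""
    needle = keyword.lower()
    return [
        a for a in artifacts
        if not a.model_authored and needle in a.content.lower()
    ]


prior = [
    Artifact("inv-001", "logs/gateway.txt", "403 Forbidden from billing API", False),
    Artifact("inv-001", "summary.md", "Root cause: billing API key expired", True),
    Artifact("inv-002", "logs/worker.txt", "timeout waiting for queue", False),
]

hits = retrieve_raw(prior, "billing")
print([a.path for a in hits])  # prints: ['logs/gateway.txt']
```

The summary also mentions the billing API, but it is withheld, so the current model has to form its own reading of the log.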


For speculative exploration of what would work if Factosis ever needed cross-investigation memory, see Consensus-Based Agent Memory — agreement gates, corporate dysfunction pathologies, and the devil’s advocate problem.

For a mitigation of within-investigation path dependence, see Convergence Testing — run N temps with the same brief and compare outcomes.