What SR-SI actually is - and why I stopped explaining it as an AI trick

For the past year, whenever I described SR-SI to someone, I’d watch their eyes settle on the wrong thing.

They’d hear “AI memory” and file it under prompt engineering. They’d hear “context management” and assume I meant chunking strategies or retrieval-augmented generation. They’d hear the results — 106x improvement in token coherence across a 66,475-line production codebase — and assume I was talking about a clever trick.

It’s not a trick. It’s an architecture. And the distinction matters more than it sounds.

The problem it solves

Large language models are fluent but structurally forgetful. They appear confident at every point in a conversation — they’ll answer your question about component X with authority, even after 300 prompts of architectural drift have made their internal model of your codebase completely wrong.

The confidence is the danger. I started calling this the Cognitive Mirage: the model looks coherent while its working model of your system quietly unravels.

The standard response to this is to write better prompts. Give it more context. Use a bigger context window. Chunk your documents and retrieve them at query time.

These interventions treat the symptom. They assume the problem is retrieval — that the model has the information somewhere and just needs better access to it.

SR-SI is based on a different diagnosis: the problem is architecture. The model was never given a stable representation of the system to work from. You’re asking it to hold a mental map of your project through conversational inference, and conversational inference degrades under load.

What SR-SI actually does

Simulated Recall via Shallow Indexing externalizes architectural memory into a compact, machine-optimized index that the model generates, maintains, and consults before every task.

Not a README. Not a wiki. Not a context dump. An index — the same way a book’s index differs from its contents.

The index doesn’t contain the knowledge; it contains the map to where the knowledge lives and the structural relationships between the parts that matter.

The model builds this index. It maintains it as the project evolves. It consults it before acting.

This converts the model from a stateless conversational agent into what the paper calls a persistent system collaborator: something that can re-orient itself on demand rather than relying on what happened to be in the last 20 messages.
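The paper describes the index conceptually rather than as a concrete file format, but the build/maintain/consult loop can be sketched in a few lines. Everything below is a hypothetical illustration: the field names, structure, and `consult` helper are my assumptions, not the paper's specification.

```python
# Hypothetical sketch of a shallow index: it stores pointers to where
# knowledge lives and the structural relationships between parts -- never
# the file contents themselves. All field names are illustrative
# assumptions, not the paper's format.
index = {
    "auth": {
        "files": ["src/auth/session.py", "src/auth/tokens.py"],
        "depends_on": ["db", "config"],
        "notes": "tokens rotated server-side; entry point is session.py",
    },
    "db": {
        "files": ["src/db/pool.py"],
        "depends_on": ["config"],
        "notes": "single connection pool; no raw SQL outside this module",
    },
}

def consult(index, area):
    """Load only the sub-index relevant to the current task.

    This is the 'consult before acting' step: the agent re-orients from
    the map instead of inferring structure from conversation history.
    """
    entry = index[area]
    # Shallow by design: return names and relationships, not contents.
    return {
        "files": entry["files"],
        "depends_on": entry["depends_on"],
        "notes": entry["notes"],
    }
```

The point of the sketch is the shape, not the syntax: the index answers "where does this live and what does it touch," and the actual knowledge stays in the files it points to.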

What the numbers actually mean

The empirical results across three real projects:

Without SR-SI: coherence drift became measurable and operationally significant at around 200 prompts. Token efficiency baseline: ~2.56 tokens per line of code. Context-maintenance cost: 15,641 tokens per task.

With SR-SI (monolithic index): 85.5% reduction in context-maintenance cost. No measurable context-loss event across 1,006+ prompts.

With SR-SI (modular sub-indices by feature area): per-task context load dropped from 15,641 tokens to approximately 1,645 tokens. The Token Coherence Metric — LOC divided by net tokens — moved from a baseline of 0.38 to 40.4. That’s the 106x figure.

It’s not a benchmark result; it’s a measurement taken on a live 66,475-line production codebase over months of active development.
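As a sanity check, the arithmetic behind those figures works out from the numbers already reported above:

```python
# Token Coherence Metric (TCM) = LOC / net tokens, using the figures
# reported above for the 66,475-line codebase.
loc = 66_475
baseline_tcm = 0.38          # reported baseline TCM
modular_tokens = 1_645       # per-task context load, modular sub-indices

modular_tcm = loc / modular_tokens        # ~ 40.4
improvement = modular_tcm / baseline_tcm  # ~ 106x
```

66,475 divided by roughly 1,645 tokens per task gives a TCM of about 40.4, and 40.4 over the 0.38 baseline is where the 106x comes from.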

The key finding that took me longest to articulate: modularization alone does nothing. Breaking your codebase into modules without the indexing workflow produced no coherence improvement in the baseline comparison.

The gains come entirely from the index architecture, not from code organization.

Why I stopped calling it an AI trick

A trick is a local optimization. It works in the specific context where you apply it and doesn’t generalize.

SR-SI is a workflow change — it changes the relationship between the developer, the model, and the codebase. Once it’s in place, documentation becomes a zero-cost byproduct rather than a separate task.

Onboarding a new AI agent to the project becomes a matter of minutes rather than days of re-teaching.

The reason this matters beyond software development — and why I’ve spent the past six months applying it across five separate products and a consulting practice — is that the same architectural problem it solves in AI-assisted development also exists in human teams.

Knowledge that lives only in conversation degrades. Teams that never externalize their architectural understanding into an index pay a coherence tax on every decision, every handoff, every new hire.

SR-SI started as a solution to an AI problem. It turned out to be a solution to a knowledge architecture problem that just happens to also apply to AI.

The full paper is linked below. It’s 54 pages. You don’t need to read all of it — the executive summary and sections 5 and 7 are where the architecture and empirical results live. Start there.

SR-SI: The methodology that gives AI persistent memory across any long-running project