MAY 18, 2026

FLORIAN VALEYE

Altertable Agent, Architecture, Engineering

Memory Is Not a Database

Intelligence without memory is nothing. The right model is memory that lives, forgets, and knows where it belongs.

Intelligence without memory is nothing. But memory that never forgets isn't intelligence.

It's a log with a search box.

An agent that persists is categorically different from one that restarts every session: different in kind, not just in capability. An agent that remembers which hypotheses it already ruled out, which user prefers their data broken down by cohort, which anomaly the team investigated and closed last Tuesday is not the same tool. It is one that learns, one that deepens.

It helps to be precise about what "memory" means here, because there are two distinct concepts that often get conflated. What an agent holds during a run is working memory. It's bounded and ephemeral: key findings, observations, open questions, gone when the session ends unless the agent deliberately writes something out. Long-term memory is what crosses that boundary: the facts, patterns, and techniques that survive between runs, scoped and indexed so they can be retrieved in future sessions.
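The distinction can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the production design: `WorkingMemory`, `LongTermStore`, `write_out`, and the `capacity` bound are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LongTermStore:
    """Hypothetical persistent store; a real one would be durable and indexed."""
    records: list[str] = field(default_factory=list)


@dataclass
class WorkingMemory:
    """Bounded, per-session notes that vanish when the session ends."""
    capacity: int = 5  # assumed bound, for illustration only
    notes: list[str] = field(default_factory=list)

    def observe(self, note: str) -> None:
        self.notes.append(note)
        # Keep only the most recent notes once the bound is exceeded.
        self.notes = self.notes[-self.capacity:]

    def write_out(self, note: str, store: LongTermStore) -> None:
        # Only a deliberate write crosses the session boundary.
        store.records.append(note)


store = LongTermStore()

session = WorkingMemory()
session.observe("open question: is the churn spike seasonal?")
session.observe("finding: cohort B prefers weekly breakdowns")
session.write_out("cohort B prefers weekly breakdowns", store)
del session  # the session ends; its working notes are gone

next_session = WorkingMemory()
print(next_session.notes)  # nothing carried over implicitly
print(store.records)       # only what was deliberately written out
```

The next session starts with empty working memory and sees only what the previous one chose to persist.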

Without the second, every session starts from zero. With it, what the agent learns persists. It picks up where the last run left off, it carries what each user has taught it without being re-asked, every run leaves the next one a little sharper, and important lessons reach beyond the agent that learned them.

But an agent that remembers everything, that stores every episode, every observation, every confident but wrong inference, and never lets go, accumulates noise as readily as it accumulates signal. The length of memory and the quality of memory are not the same thing. One grows automatically. The other requires a system that knows how to forget. The context window the agent retrieves into is finite, and stale memories crowd out the useful ones. Session quality drops with every retrieval that fills the window with what was true a quarter ago instead of what matters now.

Most memory systems today come down to the same model: a database, fast retrieval, smarter ranking. The usual taxonomy maps human memory to implementation choices: sensory becomes numerical encodings, working memory becomes the context window, long-term memory becomes a searchable store. It's a useful map. It stops at the territory's edge.

It tells you what memory is. It says nothing about what memory must do to stay true and relevant.

The right answer is a memory that lives: that consolidates what it learns, that knows where it belongs, and that forgets what it no longer needs to carry.

Memory has topology

The default way to split agent memory is personal versus shared. Personal is what one agent knows on its own. Shared is what every other agent in the same organization can access. The split matters, but it doesn't go far enough.

A finding from a single analysis run isn't organizational knowledge. It's an episode. It belongs to that workflow, that moment, that agent's working session. Surfacing it in an unrelated workflow analyzing a different product is noise that looks like context. Without scope, every observation becomes a candidate for every retrieval, and signal-to-noise collapses as the store grows.

At the same time, some knowledge is genuinely universal. A pattern that emerges independently across five workflows, the kind of fact the whole organization tracks as a KPI or remembers as a defining business event, isn't specific to any one workflow. It's institutional truth, a stable fact about the business that holds regardless of who's using it. Every agent that touches retention should be able to rely on it.

The shape of where a memory lives matters as much as what's in it. At Altertable, we model this across five scopes:

Figure: Memory topology. Five scopes (organization, workflow, agent, entity, user) with promotion path.

These boundaries map privacy, lifecycle, and governance onto the memory layer rather than just onto the data.

The organization boundary itself is hard, by design rather than by configuration. A memory written in one organization never reaches another.
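A rough sketch of what a hard boundary can look like in code. The five scope names come from the topology above; everything else (`Memory`, `Caller`, `can_read`, the `memberships` shape) is a hypothetical illustration, not the actual Altertable API.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    # The five scopes named in the article.
    ORGANIZATION = "organization"
    WORKFLOW = "workflow"
    AGENT = "agent"
    ENTITY = "entity"
    USER = "user"


@dataclass(frozen=True)
class Memory:
    org_id: str
    scope: Scope
    scope_id: str  # e.g. the workflow ID when scope is WORKFLOW
    content: str


@dataclass
class Caller:
    org_id: str
    # IDs the caller belongs to, keyed by scope (hypothetical shape).
    memberships: dict


def can_read(caller: Caller, memory: Memory) -> bool:
    # Hard boundary: checked first, and no branch below can override it.
    if caller.org_id != memory.org_id:
        return False
    if memory.scope is Scope.ORGANIZATION:
        return True
    # Narrower scopes require membership in that specific scope instance.
    return memory.scope_id in caller.memberships.get(memory.scope, set())


note = Memory("acme", Scope.WORKFLOW, "wf-retention", "D30 retention note")
insider = Caller("acme", {Scope.WORKFLOW: {"wf-retention"}})
outsider = Caller("globex", {Scope.WORKFLOW: {"wf-retention"}})

print(can_read(insider, note))
print(can_read(outsider, note))
```

The check never inspects `content`: the declared scope and the organization ID are enough to decide.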

A tempting alternative is to let the model decide where a memory belongs at write time, with no predefined hierarchy. It sounds simpler.

In practice, when scope is inferred rather than declared, you can't enforce access control without reading every memory's content, you can't assign different decay rates by scope without knowing what scope a memory will land in, and you can't build promotion rules across boundaries without a stable topology to promote across.

Explicit scopes are a constraint. That constraint is what makes everything else in the system predictable: access, lifetime, promotion, governance. We chose the constraint.
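One way to picture that predictability is a policy table keyed by declared scope. In this sketch `ScopePolicy`, `POLICIES`, and `policy_for` are hypothetical names and every half-life is an invented number. The only promotion target shown is workflow to organization, because that is the only path the article describes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    ORGANIZATION = "organization"
    WORKFLOW = "workflow"
    AGENT = "agent"
    ENTITY = "entity"
    USER = "user"


@dataclass(frozen=True)
class ScopePolicy:
    half_life_days: float         # assumed values, for illustration
    promotes_to: Optional[Scope]  # None: no promotion target modeled here


# Every number below is invented for the example.
POLICIES = {
    Scope.ORGANIZATION: ScopePolicy(half_life_days=180.0, promotes_to=None),
    Scope.WORKFLOW: ScopePolicy(half_life_days=30.0, promotes_to=Scope.ORGANIZATION),
    Scope.AGENT: ScopePolicy(half_life_days=14.0, promotes_to=None),
    Scope.ENTITY: ScopePolicy(half_life_days=60.0, promotes_to=None),
    Scope.USER: ScopePolicy(half_life_days=90.0, promotes_to=None),
}


def policy_for(scope: Scope) -> ScopePolicy:
    # The lookup needs only the declared scope, never the memory's content.
    return POLICIES[scope]


print(policy_for(Scope.WORKFLOW))
```

With inferred scopes, this table would have nothing stable to key on until a model had read each memory.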

Memory as a living process

The right model isn't a store. It's a cycle.

We store three kinds of long-term memory, and the distinction matters:

Episodic: the raw notes from a specific run, what the agent observed, what it found, what happened in this analysis at this moment.
Semantic: patterns abstracted from episodes. Not "I saw X on Tuesday" but "X is true."
Procedural: how to act, not what to know. Which query patterns reliably surface signal for a particular data shape. Which analysis paths tend to work. Which approaches have failed, and under what conditions.
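As a data-shape sketch, the three kinds might look like this. The field names are assumptions made for illustration; the article does not specify a schema.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class EpisodicMemory:
    """Raw note from one run: what was observed, and when."""
    run_id: str
    observed_at: datetime
    note: str


@dataclass
class SemanticMemory:
    """A pattern abstracted from episodes: 'X is true', not 'I saw X'."""
    statement: str
    source_ids: list[str] = field(default_factory=list)


@dataclass
class ProceduralMemory:
    """How to act: a technique plus the conditions where it failed."""
    technique: str
    failed_under: list[str] = field(default_factory=list)


episode = EpisodicMemory(
    run_id="run-42",
    observed_at=datetime(2026, 5, 12),
    note="Saw better 30-day retention for users who activated a third feature early.",
)
fact = SemanticMemory(
    statement="Early third-feature activation goes with better 30-day retention.",
    source_ids=[episode.run_id],
)
howto = ProceduralMemory(
    technique="Break retention down by activation cohort first.",
    failed_under=["cohorts with very few users"],
)
print(type(episode).__name__, type(fact).__name__, type(howto).__name__)
```

The episode is tied to a run and a timestamp; the semantic memory drops both and keeps only a pointer back to where it came from.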

When enough episodic memories converge on the same pattern, the system consolidates them into a semantic memory. Not a record of what happened. A fact about what's true.

That semantic memory then lives at the appropriate scope. If it keeps appearing across independent workflows, it promotes to the organization. If no agent or human retrieves it, it fades. If it gets retrieved repeatedly, its decay slows.
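Those three outcomes can be read as a small decision rule. The sketch below assumes a relevance score between 0 and 1 and uses invented thresholds; `next_step`, `promote_at`, and `archive_below` are hypothetical names.

```python
def next_step(
    distinct_workflows: int,
    relevance: float,
    promote_at: int = 3,         # assumed: workflows needed before promotion
    archive_below: float = 0.1,  # assumed: relevance floor for the live store
) -> str:
    """Decide what happens to a semantic memory on the next pass."""
    if distinct_workflows >= promote_at:
        return "promote"  # keeps appearing across independent workflows
    if relevance < archive_below:
        return "archive"  # nobody retrieved it, so it faded
    return "keep"         # still relevant at its current scope


print(next_step(distinct_workflows=3, relevance=0.8))
print(next_step(distinct_workflows=1, relevance=0.05))
print(next_step(distinct_workflows=1, relevance=0.5))
```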

Procedural memory works differently. Each technique carries a success rate and the contexts where it failed. When the system merges them, the track record carries forward. The result isn't a smoothed-over fact. It's a technique with a history: here's what works, where it breaks, and how confident the system is.
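A minimal sketch of a merge that carries the track record forward, assuming each technique stores raw attempt and success counts. That storage choice is an assumption; the article only says a success rate and failure contexts are kept. `Technique` and `merge` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Technique:
    name: str
    attempts: int
    successes: int
    failure_contexts: set[str] = field(default_factory=set)

    @property
    def success_rate(self) -> float:
        return self.successes / self.attempts if self.attempts else 0.0


def merge(a: Technique, b: Technique) -> Technique:
    # Pool the raw counts so the merged rate is weighted by evidence,
    # rather than averaging two rates as if they were equally supported.
    return Technique(
        name=a.name,
        attempts=a.attempts + b.attempts,
        successes=a.successes + b.successes,
        failure_contexts=a.failure_contexts | b.failure_contexts,
    )


a = Technique("cohort-first breakdown", attempts=10, successes=8,
              failure_contexts={"sparse cohorts"})
b = Technique("cohort-first breakdown", attempts=2, successes=1,
              failure_contexts={"late-arriving events"})

merged = merge(a, b)
print(merged.success_rate)
print(sorted(merged.failure_contexts))
```

Pooling counts keeps a technique tried ten times from being outvoted by one tried twice, and the union of failure contexts preserves where it breaks.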

Consolidation isn't maintenance. It's how the system learns.

The pipeline runs automatically. It collects memories old enough to have proven themselves, groups related ones, asks a language model to abstract the shared pattern, promotes what qualifies, and removes what doesn't. When new evidence concerns the same pattern as a semantic memory that already exists, whether it agrees or conflicts, the system doesn't pick a winner. It re-abstracts the union and replaces what was there. Conflict isn't a vote. It's just another consolidation.

Every consolidated memory carries the IDs of the sources that produced it, so the chain of evidence stays queryable even after the source rows are archived.
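The pipeline and its lineage can be sketched roughly as follows. This is a simplification under several assumptions: grouping uses a precomputed `topic` key as a stand-in for similarity search, the language model is replaced by a caller-supplied `abstract` function, archiving is left out, and `min_age` and `min_workflows` are invented values.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable


@dataclass
class Episode:
    id: str
    workflow_id: str
    topic: str  # stand-in for similarity grouping
    text: str
    created_at: datetime


@dataclass
class Consolidated:
    statement: str
    source_ids: list[str] = field(default_factory=list)  # lineage


def consolidate(
    episodes: list[Episode],
    now: datetime,
    abstract: Callable[[list[str]], str],
    min_age: timedelta = timedelta(days=7),  # assumed
    min_workflows: int = 3,                  # assumed
) -> list[Consolidated]:
    # 1. Collect memories old enough to have proven themselves.
    mature = [e for e in episodes if now - e.created_at >= min_age]

    # 2. Group related memories (here: by a precomputed topic key).
    groups: dict[str, list[Episode]] = defaultdict(list)
    for e in mature:
        groups[e.topic].append(e)

    results = []
    for group in groups.values():
        # 3. Promote only patterns seen in enough independent workflows.
        if len({e.workflow_id for e in group}) < min_workflows:
            continue
        # 4. Abstract the shared pattern and keep the source IDs as lineage.
        statement = abstract([e.text for e in group])
        results.append(Consolidated(statement, [e.id for e in group]))
    return results


now = datetime(2026, 5, 18)
old = now - timedelta(days=10)
episodes = [
    Episode("e1", "wf-a", "activation", "early third feature, better D30", old),
    Episode("e2", "wf-b", "activation", "third feature in 48h lifts D30", old),
    Episode("e3", "wf-c", "activation", "fast activation, higher D30", old),
    Episode("e4", "wf-a", "pricing", "one-off pricing note", now - timedelta(days=1)),
]

# A trivial stand-in for the language model call.
facts = consolidate(episodes, now, abstract=lambda texts: texts[0])
print(facts)
```

Because `source_ids` travels with the consolidated record, the question "which episodes produced this fact?" stays answerable without the episodes themselves being in the live store.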

The similarity threshold for consolidation is deliberately strict because looser values conflate related but distinct patterns into false commonalities, surfacing coincidence as fact. When a pattern does clear it, no one decided it was ready. The system noticed it had become true.

A concrete sequence to make this tangible: three workflows, run independently by different agents at different times, each discover from their own data that users who activate a third feature within 48 hours show substantially better retention at 30 days. No workflow knows the others exist.

Each writes the observation as a memory scoped to its own workflow, indexed for similarity search. On the next consolidation run, the system measures how close those memories are to each other. All three land close enough to cross the threshold.

A language model abstracts them into a single organization-level memory, with the three source IDs preserved as lineage. The originals are archived, but the new fact is stronger for being drawn from all of them. Every agent that now touches retention reasoning inherits the pattern, and the lineage records how many independent workflows had to discover it before the system believed it.
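The "close enough" step in that sequence can be illustrated with cosine similarity over embeddings. The vectors and the 0.95 threshold below are made up for the example, and cosine similarity itself is an assumption: the article does not name the metric it uses.

```python
import math
from itertools import combinations


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def all_within(vectors: list[list[float]], threshold: float) -> bool:
    # Every pair must clear the threshold, not just the average pair.
    return all(cosine(a, b) >= threshold for a, b in combinations(vectors, 2))


# Made-up 3-dimensional embeddings of the three workflow memories.
workflow_memories = [
    [0.90, 0.10, 0.40],  # wf-a
    [0.88, 0.12, 0.42],  # wf-b
    [0.91, 0.09, 0.38],  # wf-c
]
unrelated = [0.10, 0.95, 0.20]  # a memory about something else

THRESHOLD = 0.95  # assumed; the article only says the value is strict

print(all_within(workflow_memories, THRESHOLD))
print(all_within(workflow_memories + [unrelated], THRESHOLD))
```

Requiring every pair to clear the bar is one way to keep a single loosely related memory from being swept into the group.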

Exposing these operations as MCP tools allows any compliant host (Claude, Codex, or any agent speaking MCP) to read and write against the same store. The cost of remembering is paid once, and the value reaches every consumer.

The open problems sit in two places.

Discoverability comes first: we instruct every workflow to query memory before working on referenced entities, so agents retrieve proactively rather than waiting to be asked, but they can only retrieve what they know to look for. The retrieval stays explicit. A query, a filter, a keyword. The associative layer that would surface the memory you didn't know to ask for is the piece still missing.

The second is verbatim recall. The abstraction survives, the lineage survives, the verbatim source episodes do not. Consolidation keeps the fact, its tags, its entities, the timestamp, the source IDs, and pointers back to the source objects, plus audit logs for every consolidation event. For the agent's reasoning, this is fine. For deep forensic recall, the trail stops at the consolidation boundary.

Memory must forget

Picture an agent that learned something true, held it faithfully, and applied it to a world that had already moved on. It didn't fail. It remembered wrong, and that is harder to detect and harder to fix.

The instinct when this happens is to improve retrieval: richer representations, smarter reranking, better ranking signals at query time. Those are real improvements. They don't address the root problem. A memory that was accurate when written but is no longer accurate won't get better with a smarter search. Better retrieval doesn't fix stale memory. It just surfaces stale memory faster.

Forgetting is the mechanism that makes what remains trustworthy. What hasn't been accessed or reinforced decays below a threshold and is archived out of the live store. What survives that process is a more reliable signal precisely because it survived.

Hermann Ebbinghaus mapped the shape of this in 1885: memory decays exponentially with time, and each retrieval resets the clock. We extended his curve with an importance weight, so a frequently retrieved, high-importance memory can persist for months while an unaccessed, low-importance one is archived within days.

Each memory carries a relevance score, a function of time since last access, access frequency, and an importance weight set at creation. Decay rates correspond to workflow cadence, whether the workflow runs daily, weekly, monthly, or yearly. Higher access counts and importance scores slow decay over time, so a memory that keeps proving useful earns a longer half-life. A memory that nobody retrieves and that isn't important enough to slow decay falls below the relevance threshold and is archived out of the live store.

Figure: Forgetting curve over a month for four memories with different fates.
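One possible shape for such a score, as a sketch only. The exact formula is not given in the article, so the functional form, the per-cadence base half-lives, and the 0.1 archive threshold are all assumptions. The only parts taken from the text are the inputs (time since last access, access count, importance, workflow cadence) and the exponential decay.

```python
import math

# Assumed base half-lives per workflow cadence, in days.
BASE_HALF_LIFE_DAYS = {"daily": 2.0, "weekly": 14.0, "monthly": 60.0, "yearly": 365.0}
ARCHIVE_THRESHOLD = 0.1  # assumed relevance floor


def effective_half_life(cadence: str, access_count: int, importance: float) -> float:
    # More retrievals and higher importance stretch the half-life.
    base = BASE_HALF_LIFE_DAYS[cadence]
    return base * (1.0 + math.log1p(access_count)) * (1.0 + importance)


def relevance(days_since_access: float, cadence: str,
              access_count: int, importance: float) -> float:
    # Exponential decay: the score halves every effective half-life.
    half_life = effective_half_life(cadence, access_count, importance)
    return 0.5 ** (days_since_access / half_life)


# An unaccessed, low-importance note from a daily workflow, ten days on.
faded = relevance(10, "daily", access_count=0, importance=0.1)

# A frequently retrieved, high-importance memory, sixty days on.
durable = relevance(60, "weekly", access_count=20, importance=0.9)

print(faded < ARCHIVE_THRESHOLD)    # this one would be archived
print(durable > ARCHIVE_THRESHOLD)  # this one stays in the live store
```

Measuring time from the last access is what makes each retrieval "reset the clock": a retrieval sets `days_since_access` back to zero and raises `access_count`, so the next decay is slower.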

The agents that run on this system don't carry a growing pile of context. They carry what's still true. They improve over time by trusting what matters: knowledge that has crossed workflows, earned its promotion, and remains current.

What changes when memory is right

Storage that doesn't shrink balloons in dollars and watts as readily as it balloons in noise. Memory isn't read once. Each retrieval pulls it back into the agent's context window, on every query and in every session, and every unnecessary byte gets re-paid in tokens. Forgetting isn't only how the system stays true. It's how it stays affordable as use scales. Quality and cost move together, not against each other.
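A back-of-the-envelope version of that re-payment, with every number invented purely to show the arithmetic. `wasted_tokens` is a hypothetical helper, not a measured result.

```python
def wasted_tokens(queries: int, memories_per_query: int,
                  tokens_per_memory: int, stale_fraction: float) -> float:
    # Stale memories are paid for again on every retrieval that includes them.
    return queries * memories_per_query * tokens_per_memory * stale_fraction


# Invented inputs: 10,000 queries, 8 memories retrieved per query,
# 150 tokens per memory, a quarter of them stale.
print(wasted_tokens(10_000, 8, 150, 0.25))
```

The cost scales with the number of queries, not with storage size, which is why it grows with use rather than with time alone.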

An agent whose memory consolidates, promotes, and forgets is doing something different. It's deciding, continuously, what's worth keeping. That's not a retrieval problem. It's not an infrastructure problem. It's a design question.

The race to scale memory infrastructure will produce agents with more context at retrieval time: larger, richer, faster. It won't produce agents whose memory is truer than last month's.

The question was never how much agents remember. It was whether what they remember is shared, trusted, and still true.

Florian Valeye

Staff Backend Engineer

Data platform expert with deep expertise in distributed systems and modern data infrastructure. Delta Lake committer and contributor, previously scaled data pipelines at Back Market.