How I wired a Graph DB on top of my vector store to scale 1K agents for 2 months, because vector search alone fails when user preferences change over time.
A detailed architectural guide for building long-running AI agents that handle changing user preferences over time by combining a vector store, graph DB, and temporal edges instead of overwriting data.
Most agentic memory patterns are naturally designed around short-lived chat sessions. The focus there is straightforward: track the active thread, keep a basic user profile, and reset the context once the conversation closes.

But when you operate long-running AI agents in production over extended periods, the architectural needs completely change. These agents don't get reset. They work for weeks on end, hand off tasks between execution loops, and face a massive real-world hurdle: **facts change over time.**

If a user uses Gmail today and switches to Outlook next month, the agent needs to track both. It has to know which one is current, exactly when the switch happened, and it cannot act like the old truth is still valid. Standard vector database similarity scores do not understand chronological decay or truth overrides.

Memory in a long-running agent isn't a single database. It requires distinct layers running in parallel across multiple DB types. After dealing with this problem for a while, here is the 7-layer architecture I landed on to handle it:

**1. Working Memory** The active per-turn scratchpad. I enforce a strict execution wall here so temporary reasoning or transient tokens never leak into long-term storage.

**2. Conversation Memory** Immediate thread history, managed by a dynamic summarizer middleware before it crosses token context thresholds.

**3. Episodic Memory** A time-indexed log of past runs, especially the failed ones. This gives the agent continuity of its own execution history so it doesn't repeat past mistakes.

**4. Semantic Memory** Slow-changing, deterministic facts. I split this into a human-editable markdown file (for explicit user configurations) and an LLM-extracted graph. If they disagree, the human notebook explicitly wins.

**5. Knowledge Graph** The relational structure. While semantic memory holds the raw facts, this layer maps the structural edges between entities. A vector store treats data like isolated islands; the graph connects them contextually.

**6. Procedural Memory** Behavior and execution mechanics, not facts. This stores the specific habits, tool-use skills, and workflow patterns the agent reproduces across its automation loops.

**7. Checkpoints** State snapshots. This is the difference between a pod crash starting a 40-minute multi-step task over from scratch, or resuming smoothly at minute 33.

# The Core Breakthrough: Temporal Edges

The biggest win was to **stop deleting or overwriting data** when preferences or environments change. Instead, every extracted fact in the semantic and graph layers needs a `valid_at` and `invalid_at` timestamp.

When today’s session contradicts yesterday’s state, the pipeline invalidates the old edge instead of erasing it. This preserves a clean, immutable audit trail and allows the LLM to logically reason about *when* a preference or infrastructure shifted.
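To make the temporal-edge idea concrete, here is a minimal in-memory sketch of invalidating an edge instead of overwriting it. It is not the actual pipeline: the `TemporalEdge` and `TemporalGraph` names, the `assert_fact`, `current`, and `as_of` methods, and the example dates are all hypothetical, and a plain Python list stands in for the graph DB. It also assumes each (subject, predicate) pair holds a single value at a time, which fits the Gmail-to-Outlook case but not multi-valued relations.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class TemporalEdge:
    """One fact with a validity window (hypothetical shape, for illustration)."""
    subject: str
    predicate: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means "still current"


class TemporalGraph:
    """Append-only edge list: facts are invalidated, never deleted."""

    def __init__(self) -> None:
        self.edges: List[TemporalEdge] = []

    def assert_fact(self, subject: str, predicate: str, obj: str, at: datetime) -> None:
        for edge in self.edges:
            if (edge.subject == subject and edge.predicate == predicate
                    and edge.invalid_at is None):
                if edge.obj == obj:
                    return  # fact is already current; nothing new to record
                # Contradiction: close the old edge but keep it for the audit trail.
                edge.invalid_at = at
        self.edges.append(TemporalEdge(subject, predicate, obj, valid_at=at))

    def current(self, subject: str, predicate: str) -> Optional[str]:
        """Return the value of the edge that has not been invalidated, if any."""
        for edge in self.edges:
            if (edge.subject == subject and edge.predicate == predicate
                    and edge.invalid_at is None):
                return edge.obj
        return None

    def as_of(self, subject: str, predicate: str, when: datetime) -> Optional[str]:
        """Return the value that was valid at `when` (valid_at <= when < invalid_at)."""
        for edge in self.edges:
            if (edge.subject == subject and edge.predicate == predicate
                    and edge.valid_at <= when
                    and (edge.invalid_at is None or when < edge.invalid_at)):
                return edge.obj
        return None


# Usage with made-up dates: the user switches email clients a month later.
graph = TemporalGraph()
graph.assert_fact("user", "email_client", "Gmail", datetime(2024, 1, 10))
graph.assert_fact("user", "email_client", "Outlook", datetime(2024, 2, 12))

print(graph.current("user", "email_client"))
print(graph.as_of("user", "email_client", datetime(2024, 1, 20)))
```

Because the Gmail edge is closed rather than removed, the agent can answer both "what does the user use now?" and "what did they use on a given date?", and the `invalid_at` value on the old edge records when the switch happened.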