My agent kept forgetting who 'Karpathy' was between sessions. Here's the architecture that fixed it

Reddit r/AI_Agents 05/20/26, 08:29 AM Tools

agent-memory knowledge-graph neo4j entity-resolution graphrag llm-agent vector-database

Summary

A developer shares an architecture using Neo4j knowledge graphs with typed entities and deduplication to solve the problem of AI agents forgetting entity identity across sessions, moving beyond flat files and vector stores.

I run a second brain on Obsidian, Readwise, NotebookLM, and Claude Code. For each topic, I build a scoped wiki structured as the LLM Knowledge Base Andrej Karpathy proposed. That setup fails to extract and maintain shared entities and facts as the knowledge base grows. If "Claude Code" appears in 10 documents, I can't unify it, rank it by frequency, or link it to Anthropic, Codex, and Gemini CLI once I'm past 50 documents.

The problem is that file systems are append-only logs that fragment context, and vector indexes give fuzzy recall but no sense of identity, so there's no way to know if this is the same "Karpathy" entity you had yesterday. Knowledge-graph memory is the next step on the arc from RAG to agentic RAG to agent memory via GraphRAG, and a Neo4j repo I read for 2 days nails the pattern. Durable agent memory needs a structured graph that tracks identity, not just recall.

Here is the architecture:

1. The repo has an SDK where natural language goes in on the write side and a fused memory context comes out on the read side, all anchored to 1 Neo4j graph.
2. The architecture uses 3 memory tiers within 1 graph. Short-term memory is a linear `:Message` sequence, and long-term memory is a deduplicated, typed `:Entity` graph. Reasoning memory is a tree per agent run that records past successful or failed thinking patterns so the agent can one-shot future requests, which is similar to RL but at the database level.
3. The system follows a POLE+O ontology, which is a closed 5-type vocabulary consisting of Person, Object, Location, Event, and Organization. Every entity is exactly 1 type, materialized as multi-tier Neo4j labels, alongside `:Fact` nodes for generic claims and `:Preference` nodes that use a `SUPERSEDED_BY` relationship.
4. Extraction works as a speed-versus-accuracy ladder where spaCy handles fast NER and GLiNER/GLiREL do zero-shot extraction. The LLM stage fires only for real semantics and relationships, so cheap models clear high-confidence cases, and you don't pay LLM costs on every mention.
5. Resolution and deduplication are 2 different problems. Resolution canonicalizes names using fuzzy matching, while deduplication uses a vector score to decide if a new node is created. A false merge is silent and unrecoverable. A false split is noisy but recoverable.
6. A single Cypher query handles the entire retrieval by fusing vector similarity, multi-hop traversal, and conversation walks. This removes the need for cross-store joins or an external orchestrator, though context compression remains your responsibility.

This repo is a blueprint, not a verdict. You can steal these patterns and ship them on Postgres or MongoDB to avoid running a graph database in production. I still use Neo4j for data mining, but after spending 2 days in the codebase, I think the logic matters more than the database.

How are you handling agent memory today? Flat files, a vector index, a knowledge graph, or something stranger?

**TL;DR:** Files and vector stores can recall text but can't track identity across sessions. A knowledge graph with typed entities and a formal dedup step is what turns recall into durable memory.
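To make the split between resolution and deduplication (point 5) concrete, here is a minimal sketch in plain Python. It is not the repo's code: the `Entity` class, the function names, the in-memory list standing in for the graph, and all three thresholds are assumptions picked for illustration, and the embeddings are hand-written toy vectors. Resolution uses fuzzy string matching to pick a canonical name; deduplication uses cosine similarity to decide between merging, creating a flagged node, and creating a fresh node. The middle band deliberately prefers a split over a merge, since a false split can be repaired later and a false merge cannot.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Entity:
    name: str                      # canonical name
    type: str                      # one of Person, Object, Location, Event, Organization
    embedding: list[float]         # toy vector; a real system would use a model
    aliases: set[str] = field(default_factory=set)


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace before comparing names."""
    return " ".join(name.lower().split())


def resolve_name(mention: str, known_names: list[str], min_ratio: float = 0.85) -> str:
    """Resolution: map a surface form to a known canonical name, if close enough."""
    best, best_ratio = None, 0.0
    for candidate in known_names:
        ratio = SequenceMatcher(None, normalize(mention), normalize(candidate)).ratio()
        if ratio > best_ratio:
            best, best_ratio = candidate, ratio
    # min_ratio is an assumed threshold, not a value taken from the repo
    return best if best is not None and best_ratio >= min_ratio else mention


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def upsert(graph: list[Entity], mention: str, type_: str, embedding: list[float],
           merge_at: float = 0.90, review_at: float = 0.75) -> tuple[str, Entity]:
    """Deduplication: use a vector score to decide whether a new node is created."""
    same_type = [e for e in graph if e.type == type_]
    name = resolve_name(mention, [e.name for e in same_type])
    best, score = None, -1.0
    for entity in same_type:
        s = cosine(entity.embedding, embedding)
        if s > score:
            best, score = entity, s
    if best is not None and score >= merge_at:
        best.aliases.add(mention)          # confident match: merge into existing node
        return "merged", best
    created = Entity(name, type_, embedding)
    graph.append(created)                  # otherwise keep the nodes separate
    if best is not None and score >= review_at:
        return "created_flagged", created  # possible duplicate, left for later review
    return "created", created
```

Calling `upsert(graph, "andrej karpathy", "Person", vec)` against a graph that already holds an "Andrej Karpathy" node merges only when `vec` scores at or above `merge_at` against it. A borderline score yields a second node tagged `created_flagged`, which is the noisy-but-recoverable outcome.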


Similar Articles

@akshay_pachaar: Your agent remembers everything and understands nothing. Most agent memory systems optimize for recall. The harder prob…

I spent a year building agent memory on knowledge graphs. Here are the 5 mistakes that cost me months

@pauliusztin_: I researched how Cognee, Graphiti and agent-memory (by Neo4j) built their agent-memory solutions and compiled the whole…

Tired of onboarding your agent every session? Building a memory system to fix the problem? Here's a guide to some things you should be thinking about when designing your system.

@akshay_pachaar: https://x.com/akshay_pachaar/status/2058976178908885210
