Agent memory is not just RAG over user facts

Reddit r/AI_Agents News

Summary

The article argues that simple RAG-based agent memory systems fail in production due to issues like stale preferences, missed keywords, and prompt injection, and advocates for a layered memory architecture with active selection, deterministic fallback, governance, and testing.

I keep seeing agent memory implemented as:

1. Extract facts/preferences from conversation
2. Store them
3. Retrieve top-k before each response
4. Inject them into the prompt

This works for demos, but it breaks in production because memory becomes policy once it enters the prompt. A stale preference can be true and still wrong for the current task. A follow-up question can omit the original task keyword. An edited memory can keep a stale embedding. A selector failure can accidentally lead to broad prompt injection.

The pattern I'm arguing for:

- layered memory: evidence / scene / stable profile
- Active Memory selection before injection
- deterministic fallback, never full injection
- memory_usage telemetry
- governance: edit, deprecate, merge, supersede
- janitor cleanup for memories that repeatedly pollute context
- scenario replay tests based on real traces

Curious how others are handling "memory that is true but should not influence this turn." I'll put the full write-up in a comment to respect the subreddit rule about links.
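The selection-with-fallback idea from the post can be sketched in a few lines. This is a minimal illustration, not code from the article: the `Memory` layers, the keyword-overlap scoring, and the `usage_log` telemetry shape are all assumptions standing in for whatever relevance model and logging a real system would use. The key behaviors it demonstrates are scoring memories against the current task before injection, logging every decision, and falling back deterministically to the stable profile instead of injecting everything when nothing scores well.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    layer: str          # hypothetical layers: "evidence", "scene", "stable_profile"
    deprecated: bool = False  # governance flag: deprecated memories are never injected

def select_memories(memories, task_keywords, threshold=1):
    """Active selection: score each memory against the current task and
    fall back deterministically (stable profile only) when nothing
    clears the threshold -- never full injection."""
    usage_log = []      # memory_usage telemetry: (text, score, injected)
    selected = []
    for m in memories:
        if m.deprecated:
            continue
        # toy relevance score: keyword overlap with the current task
        score = sum(1 for kw in task_keywords if kw.lower() in m.text.lower())
        injected = score >= threshold
        usage_log.append((m.text, score, injected))
        if injected:
            selected.append(m)
    if not selected:
        # deterministic fallback: only the stable profile, so a selector
        # miss cannot turn into broad prompt injection
        selected = [m for m in memories
                    if m.layer == "stable_profile" and not m.deprecated]
    return selected, usage_log
```

With this shape, "memory that is true but should not influence this turn" is simply a memory whose score stays below the threshold for the current task, and the telemetry records that it was considered and skipped.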

Similar Articles

rohitg00/agentmemory

GitHub Trending (daily)

agentmemory is an open-source persistent memory layer for AI coding agents (Claude Code, Cursor, Gemini CLI, Codex CLI, etc.) that uses knowledge graphs, confidence scoring, and hybrid search to give agents long-term memory across sessions via MCP, hooks, or REST API. Built on the iii engine, it requires no external databases and exposes 51 MCP tools.

Agentmemory

Product Hunt

Agentmemory offers persistent memory for AI models such as Codex, Hermes, OpenClaw, and Claude, allowing them to maintain long-term context across interactions.

How AI agent memory works (28 minute read)

TLDR AI

The article provides a comprehensive technical overview of how AI agent memory works, distinguishing between working and long-term memory mechanisms, and discussing strategies for context management, embedding-based retrieval, and data lifecycle governance.
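Embedding-based retrieval, as surveyed in the article above, typically ranks stored memories by vector similarity to the current query. A minimal sketch of the core operation, using cosine similarity over plain Python lists rather than a real embedding model or vector store (both assumptions here):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec, memory_vecs, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(memory_vecs.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]
```

In a production system the vectors would come from an embedding model and the sort would be replaced by an approximate nearest-neighbor index, but the ranking logic is the same.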