Tag
Explains why production AI agents require a three-tier persistent memory stack—session logs (Zep), personalization (Mem0), and governed context (ContextNest)—to avoid retrieval of stale or conflicting facts.
This paper studies how memory architecture affects language emergence in LLM agents playing a Lewis signaling game, finding that persistent private notebook memory outperforms stateless agents and prevents high-capacity collapse.
This article examines the memory architecture of AI agents, using the Hermes four-layer memory model as an example. It analyzes the limitations of the Claude Code memory system and shows how to add local memory capabilities to agents through open-source frameworks such as EverOS, with detailed deployment steps.
This paper proposes ReverieMem, a three-layer memory architecture for book-based LLM role-playing agents that prevents factual overreach and stylistic monotony. It also introduces the KBF-QA benchmark and achieves significant improvements in knowledge boundary fidelity and narrative quality.
T-Mem is a new long-term conversational memory architecture that enables both descriptive and associative recall, covering cases where the query and the memory share surface features as well as cases where they are connected only by latent semantic arcs. It reaches state-of-the-art results on the LoCoMo and LoCoMo-Plus benchmarks.
Xiaomi has open-sourced MiMo Code, an AI coding assistant with a novel memory architecture that outperforms Claude Code on long-horizon tasks. The release includes free access to its MiMo-V2.5 model.
This paper proposes Orthogonal Subspace Carving (OSC), a novel memory architecture that enables deep recursive binding in a constant memory footprint by projecting fillers onto the null space of role bases, overcoming the exponential scaling of Tensor Product Representations.
This article analyzes GitHub Copilot's memory architecture, which uses structured memory objects anchored to specific code citations and employs just-in-time verification to combat knowledge staleness. With memory enabled, Copilot's pull request merge rate improved from 83% to 90% in an A/B test with real developers.
Researchers from the University of Toronto and the Vector Institute propose Segment Tree Memory (SegTreeMem), a memory architecture for long-horizon conversational agents that preserves temporal order by using a hierarchical segment tree structure for both online construction and retrieval. Experiments across three datasets show nearly 20% improvement in LLM-judge accuracy over non-temporal tree baselines.
A comprehensive breakdown of memory architecture in major AI agent platforms (Claude Code, OpenAI Codex, Copilot, Windsurf, Devin, etc.), discussing how memory is managed, current shortcomings, and future directions.
Eywa is a provenance-grounded long-term memory architecture for AI agents that stores immutable source evidence, validates extracted memories, and achieves strong benchmark results on LoCoMo, LongMemEval-S, and BEAM.
A detailed architectural guide for building long-running AI agents that handle changing user preferences over time by combining a vector store, graph DB, and temporal edges instead of overwriting data.
A developer shares Helix-AGI, a continuously-running cognitive agent using a physics-based memory retrieval system that integrates recency, structural importance, and semantic proximity via an entropic gravity equation and Euler-Lagrange dynamics, without tuning separate weights.
Introduces Stratum, a system-hardware co-design approach utilizing 3D-stackable DRAM to efficiently accelerate Mixture of Experts (MoE) models.
OpenSquilla has launched an open-source AI agent runtime designed to reduce token costs through intelligent routing, caching, and a four-tier memory architecture, claiming 60-80% cost savings.
EvolveMem introduces a self-evolving memory architecture for LLM agents that optimizes retrieval configurations through LLM-powered diagnosis and iterative research cycles, achieving significant performance improvements on benchmarks like LoCoMo and MemBench.
Lyzr Cognis introduces a unified, open-source memory system for conversational AI that fuses BM25 and Matryoshka vector search with version-aware ingestion, achieving SOTA on LoCoMo and LongMemEval benchmarks.