Tag
This paper introduces ContextForge, a hierarchical memory architecture that treats the LLM context window as a recyclable workspace, achieving significant token and speed improvements on long-horizon tasks while maintaining accuracy on a 276-million-row enterprise benchmark.
The second edition of 'The Garbage Collection Handbook' updates the authoritative reference on automatic memory management, covering new hardware, software, and algorithm developments.
This article reviews the full architectural layering of AI Agent Memory as of mid-2026, including rule files, persistent profiles, historical recall, and evidence chains. It explains how each memory layer is stored, when it is loaded, and how it is governed, emphasizing the key role of memory in helping agents do work that compounds across sessions.
A game where you play as the operating system, managing processes, memory, and I/O events to keep the user satisfied.
A community discussion about how Hermes agents should handle persistent memory across sessions, exploring an external memory layer (8mem) and comparing memory-aware vs generic outputs.
After three months using Hermes Agent, the author shares insights on memory management and profile optimization, finding that less memory and fewer profiles lead to better results.
A thread sharing practical tips for freeing up GPU memory in llama.cpp, such as offloading mmproj to CPU and adjusting KV cache types, while discussing parameters like --cache-type-k/v and --spec-draft-n-max.
Nexus Memory is an MCP-native memory server that allows AI agents to share context via a unified protocol, enabling persistent and coordinated memory across different agents without custom integration.
GeneralVLA-2 introduces GeoFuse-MV3D for improved 3D reconstruction and a governed KnowledgeBank for better memory management in robotic manipulation tasks, achieving performance gains on several benchmarks.
OPD-Evolver proposes a self-evolving agent framework using slow-fast co-evolution and on-policy self-distillation to enhance memory management and policy learning, outperforming existing methods like ReasoningBank and Skill0 across multi-domain benchmarks.
A detailed 7-day guide for setting up the Hermes AI agent, covering identity, memory, tools, and Telegram integration.
Python 3.14 introduced an incremental garbage collector that was later reverted in 3.14.5 due to memory pressure reports. This article explains the changes, their impact, and the controversy around the revert.
G-Long proposes a graph-enhanced memory management framework for long-term dialogue agents, using a fine-tuned small language model for structured triplet extraction and associative retrieval, achieving state-of-the-art performance in response generation and memory retrieval with reduced computational overhead.
A user seeks advice on preventing llama.cpp from offloading the KV cache to swap before RAM is fully exhausted, sharing their configuration on an M2 Max with 96GB RAM and a large Qwen model.
Introduces Context Window Lifecycle (CWL), a structured context eviction scheme for long-horizon LLM agents that maintains an effectively unbounded working horizon by evicting content based on a dependency graph, avoiding the limitations of summarization-based compaction and recency truncation.
A detailed walkthrough of the Linux kernel's AF_UNIX garbage collector rewrite, explaining the background, the new graph-based model, and a Use-After-Free bug.
REAL is a reasoning-enhanced graph framework for long-term memory management of LLMs that uses temporal and confidence-aware directed property graphs with non-destructive temporal updates and hybrid beam search retrieval, achieving an average improvement of 22.72%.
The author questions whether the focus on expanding context windows for AI agents is counterproductive, arguing that accumulated junk slows down long sessions and suggesting that the working context be kept small and backed by external memory.
zalloc replaces malloc, calloc, realloc, and free in C modules with Zig allocators, enabling Zig-style memory management in C code.
A user questions the feasibility of an AI memory manager system that decides what to keep or forget based on importance, reinforcement, and decay.
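As a rough illustration of the keep-or-forget idea raised in the last entry, the sketch below scores each memory from importance, reinforcement, and decay. It is not taken from the linked thread: the `MemoryItem` fields, the one-hour `HALF_LIFE`, the logarithmic reinforcement boost, and the pruning threshold are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    importance: float        # hypothetical base weight, e.g. in [0, 1]
    last_access: float       # time of last reinforcement, in seconds
    reinforcements: int = 0  # how many times the item has been reused


HALF_LIFE = 3600.0  # assumption: score halves after one hour without access


def score(item: MemoryItem, now: float) -> float:
    """Combine importance, reinforcement, and exponential decay."""
    age = max(0.0, now - item.last_access)
    decay = 0.5 ** (age / HALF_LIFE)
    boost = 1.0 + math.log1p(item.reinforcements)  # diminishing returns
    return item.importance * boost * decay


def reinforce(item: MemoryItem, now: float) -> None:
    """Record a reuse: bump the counter and reset the decay clock."""
    item.reinforcements += 1
    item.last_access = now


def prune(items: list[MemoryItem], now: float, threshold: float = 0.1) -> list[MemoryItem]:
    """Keep only the items whose current score meets the threshold."""
    return [m for m in items if score(m, now) >= threshold]
```

Under these assumptions, an item that is never reused fades below the threshold on its own, while each call to `reinforce` both raises its score and restarts the decay, so frequently used memories persist.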