Tag
This paper proposes H-Res, a method to adapt large transformer models by shaping the energy landscape of associative memories without modifying weights or adding prompts, preserving memory capacity and outperforming LoRA.
This work introduces FERNme, an open-source memory layer for AI agents that uses a fuzzy Hebbian graph to simulate associative memory, supporting features such as zero-LLM writes, persistence, forgetting, and user ownership.
This paper investigates whether the Engram module, an associative memory mechanism, provides content-addressable retrieval in autoregressive image generation or acts as a gated architectural side-pathway.
Tensor Cache introduces a two-level caching mechanism that compresses evicted key-value pairs from sliding-window attention into a fixed-size associative memory, improving long-context language modeling without unbounded memory growth.
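The two-level idea in the Tensor Cache summary (an exact recent window plus a fixed-size store for evicted pairs) can be illustrated with a generic linear associative memory. This is a minimal sketch under stated assumptions, not the paper's method: the class name `SlidingWindowWithMemory`, the outer-product write rule, and the plain matrix read are all hypothetical choices made for illustration.

```python
from collections import deque

import numpy as np


class SlidingWindowWithMemory:
    """Hypothetical two-level cache: exact recent window + fixed-size memory."""

    def __init__(self, window, d_k, d_v):
        self.window = window
        self.recent = deque()                 # level 1: exact (key, value) pairs
        self.memory = np.zeros((d_k, d_v))    # level 2: size never grows

    def append(self, k, v):
        self.recent.append((k, v))
        if len(self.recent) > self.window:
            # Assumption: an evicted pair is folded in as an outer product,
            # the classic linear associative-memory write rule.
            old_k, old_v = self.recent.popleft()
            self.memory += np.outer(old_k, old_v)

    def read_memory(self, q):
        # Reading with a query returns a key-similarity-weighted sum of the
        # stored values; it is exact only when stored keys are orthonormal.
        return q @ self.memory


cache = SlidingWindowWithMemory(window=1, d_k=3, d_v=2)
cache.append(np.array([1.0, 0.0, 0.0]), np.array([1.0, 2.0]))
cache.append(np.array([0.0, 1.0, 0.0]), np.array([3.0, 4.0]))  # evicts the first pair
recovered = cache.read_memory(np.array([1.0, 0.0, 0.0]))
```

The memory stays at `d_k x d_v` entries regardless of sequence length, which is the property the summary describes as avoiding unbounded memory growth. With non-orthogonal keys the read becomes lossy, so any real design has to manage that interference.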
This paper introduces Variational Linear Attention (VLA), a method that stabilizes memory states in linear attention mechanisms for long-context transformers. VLA reframes memory updates as an online regularized least-squares problem, proving bounded state norms and demonstrating significant speedups and improved retrieval accuracy over standard linear attention and DeltaNet.
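To make "memory updates as an online regularized least-squares problem" concrete, the sketch below uses textbook recursive least squares with a ridge term. It is an assumption that this resembles VLA's update; the summary does not give the actual rule, and the names `RidgeMemory`, `lam`, `write`, and `read` are hypothetical.

```python
import numpy as np


class RidgeMemory:
    """Online solution of min_S sum_i ||S k_i - v_i||^2 + lam * ||S||_F^2."""

    def __init__(self, d_k, d_v, lam=1.0):
        self.S = np.zeros((d_v, d_k))     # memory state mapping keys to values
        self.P = np.eye(d_k) / lam        # inverse of (lam * I + sum_i k_i k_i^T)

    def write(self, k, v):
        Pk = self.P @ k
        gain = Pk / (1.0 + k @ Pk)        # Sherman-Morrison gain vector
        # Correct the state by the current prediction error on this key.
        self.S += np.outer(v - self.S @ k, gain)
        # Rank-one downdate keeps P equal to the regularized inverse.
        self.P -= np.outer(gain, Pk)

    def read(self, q):
        return self.S @ q


rng = np.random.default_rng(0)
keys = rng.normal(size=(10, 4))
values = rng.normal(size=(10, 3))
mem = RidgeMemory(d_k=4, d_v=3, lam=0.5)
for k, v in zip(keys, values):
    mem.write(k, v)

# Batch ridge solution for comparison with the online state.
closed_form = values.T @ keys @ np.linalg.inv(0.5 * np.eye(4) + keys.T @ keys)
```

After all writes, `mem.S` matches the batch ridge solution `closed_form` up to floating-point error. By contrast, a plain additive linear-attention state accumulates `v k^T` without any error correction.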
HeLa-Mem is a bio-inspired memory architecture for LLM agents that models memory as a dynamic graph using Hebbian learning dynamics, featuring episodic and semantic memory stores to improve long-term coherence. Experiments on LoCoMo show superior performance across question categories while using fewer context tokens.
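A memory graph driven by Hebbian dynamics can be sketched generically as edges that strengthen when two items are active together and fade otherwise. The following is an illustrative toy, not HeLa-Mem's implementation: the class `HebbianGraph`, the saturating update, and the multiplicative decay are assumptions introduced here.

```python
from collections import defaultdict
from itertools import combinations


class HebbianGraph:
    """Hypothetical memory graph with Hebbian strengthening and decay."""

    def __init__(self, lr=0.5, decay=0.1):
        self.lr = lr
        self.decay = decay
        self.weights = defaultdict(float)   # (node_a, node_b) -> strength in [0, 1)

    def co_activate(self, nodes):
        # "Fire together, wire together": every pair of co-active nodes
        # moves its edge weight a fraction lr of the way toward 1.
        for a, b in combinations(sorted(set(nodes)), 2):
            w = self.weights[(a, b)]
            self.weights[(a, b)] = w + self.lr * (1.0 - w)

    def step_decay(self):
        # Assumed forgetting rule: all edges shrink by a constant factor.
        for edge in list(self.weights):
            self.weights[edge] *= 1.0 - self.decay

    def neighbors(self, node, k=3):
        # Return up to k associated nodes, strongest edge first.
        scored = [(b if a == node else a, w)
                  for (a, b), w in self.weights.items() if node in (a, b)]
        return sorted(scored, key=lambda item: -item[1])[:k]


graph = HebbianGraph(lr=0.5, decay=0.5)
graph.co_activate(["a", "b"])
graph.co_activate(["a", "b"])
graph.step_decay()
top = graph.neighbors("a")
```

Retrieval by `neighbors` follows learned associations rather than scanning the full history, which is one plausible way such a graph could reduce the number of context tokens passed to the model.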