δ-mem: Efficient Online Memory for Large Language Models

Hugging Face Daily Papers 05/12/26, 12:00 AM Papers

Summary

The paper introduces δ-mem, a lightweight memory mechanism that enhances large language models by augmenting a frozen attention backbone with a compact associative memory state. It demonstrates improved performance on memory-heavy benchmarks with minimal computational overhead.

Large language models increasingly need to accumulate and reuse historical information in long-term assistants and agent systems. Simply expanding the context window is costly and often fails to ensure effective context utilization. We propose δ-mem, a lightweight memory mechanism that augments a frozen full-attention backbone with a compact online state of associative memory. δ-mem compresses past information into a fixed-size state matrix updated by delta-rule learning, and uses its readout to generate low-rank corrections to the backbone's attention computation during generation. With only an 8×8 online memory state, δ-mem improves the average score to 1.10× that of the frozen backbone and 1.15× that of the strongest non-δ-mem memory baseline. It achieves larger gains on memory-heavy benchmarks, reaching 1.31× on MemoryAgentBench and 1.20× on LoCoMo, while largely preserving general capabilities. These results show that effective memory can be realized through a compact online state directly coupled with attention computation, without full fine-tuning, backbone replacement, or explicit context extension.
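To make "a fixed-size state matrix updated by delta-rule learning" concrete, here is a minimal NumPy sketch of the classic delta rule for an associative memory. It is not the paper's implementation: the function names, the step size beta, the key normalization, and the random keys and values are all assumptions chosen for illustration. Only the 8×8 state size comes from the abstract.

```python
import numpy as np

def delta_rule_update(S, k, v, beta=1.0):
    """One delta-rule step (illustrative, not the paper's exact rule).

    S    : (d, d) memory state, fixed size regardless of history length
    k, v : (d,) key and value to associate
    beta : assumed step size; beta=1 fully overwrites the old association
    """
    k = k / (np.linalg.norm(k) + 1e-8)    # unit-norm key keeps the step stable
    error = v - S @ k                     # what the memory currently gets wrong
    return S + beta * np.outer(error, k)  # rank-1 correction, shape unchanged

def read(S, q):
    """Read out the value associated with query q."""
    q = q / (np.linalg.norm(q) + 1e-8)
    return S @ q

d = 8                                     # 8x8 state, as quoted in the abstract
rng = np.random.default_rng(0)            # hypothetical data for the demo
S = np.zeros((d, d))

k1, v1 = rng.normal(size=d), rng.normal(size=d)
S = delta_rule_update(S, k1, v1)
recalled_first = read(S, k1)              # approximately v1

v2 = rng.normal(size=d)
S = delta_rule_update(S, k1, v2)          # same key, new value
recalled_second = read(S, k1)             # approximately v2: old value replaced
```

The point of the sketch is that the state never grows: each step writes the prediction error back along the key direction, so writing a new value under an existing key replaces the old one instead of piling up on top of it.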

Original Article


Cached at: 05/13/26, 04:11 AM

Paper page - δ-mem: Efficient Online Memory for Large Language Models

Source: https://huggingface.co/papers/2605.12357

Abstract

A lightweight memory mechanism called δ-mem enhances large language models by augmenting a frozen attention backbone with a compact associative memory state that provides low-rank corrections to attention computations.

Large language models increasingly need to accumulate and reuse historical information in long-term assistants and agent systems. Simply expanding the context window is costly and often fails to ensure effective context utilization. We propose δ-mem, a lightweight memory mechanism that augments a frozen full-attention backbone with a compact online state of associative memory. δ-mem compresses past information into a fixed-size state matrix updated by delta-rule learning, and uses its readout to generate low-rank corrections to the backbone’s attention computation during generation. With only an 8×8 online memory state, δ-mem improves the average score to 1.10× that of the frozen backbone and 1.15× that of the strongest non-δ-mem memory baseline. It achieves larger gains on memory-heavy benchmarks, reaching 1.31× on MemoryAgentBench and 1.20× on LoCoMo, while largely preserving general capabilities. These results show that effective memory can be realized through a compact online state directly coupled with attention computation, without full fine-tuning, backbone replacement, or explicit context extension.
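The abstract does not say how the memory readout is turned into a correction, so the following is only one plausible reading, sketched in NumPy. It assumes the hidden state is projected down to the memory width r = 8, read through the state matrix S, projected back up, and added to the output of an unchanged attention layer. Every name here (frozen_attention, memory_correction, W_down, W_up) and every dimension except the 8×8 state is hypothetical, and S is held fixed during the pass for simplicity.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def frozen_attention(H, Wq, Wk, Wv):
    """Single-head causal attention standing in for the frozen backbone."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)   # hide future positions
    return softmax(scores) @ V

def memory_correction(H, S, W_down, W_up):
    """Assumed coupling: project down, read the memory, project back up.

    The result factors through an r-dimensional bottleneck, so its rank
    is at most r no matter how wide the model is.
    """
    queries = H @ W_down      # (T, r) memory queries, one per token
    readout = queries @ S.T   # (T, r) row t is S @ queries[t]
    return readout @ W_up     # (T, d_model)

T, d_model, r = 16, 32, 8     # hypothetical sizes; only r = 8 is from the abstract
rng = np.random.default_rng(0)
H = rng.normal(size=(T, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(3))
W_down = rng.normal(size=(d_model, r)) * 0.1
W_up = rng.normal(size=(r, d_model)) * 0.1
S = rng.normal(size=(r, r))   # stands in for a memory state built earlier

base = frozen_attention(H, Wq, Wk, Wv)
correction = memory_correction(H, S, W_down, W_up)
out = base + correction       # backbone weights are never modified
```

Under this reading the only added parameters are the two thin projections and the r×r state, and an empty (all-zero) memory leaves the backbone's output exactly as it was.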


Similar Articles

Δ-Mem: Efficient Online Memory for Large Language Models

@dair_ai: // δ-mem: Efficient Online Memory for LLMs // One of the more elegant memory mechanisms I've seen this month. Most long…

@dair_ai: // Memory as a Model // The paper augments any LLM with a separate trained memory model that stores, retrieves, and int…

StageMem: Lifecycle-Managed Memory for Language Models

SimpleMem: Efficient Lifelong Memory for LLM Agents
