@jiqizhixin: New from NVIDIA! You can edit a model’s compressed memory without scrambling what it already knows! Enter Gated DeltaNe…
Summary
NVIDIA introduces Gated DeltaNet-2, a linear attention method that edits a model's compressed memory without scrambling what is already stored, using two independent gates for the erase and write operations. It is reported to outperform Mamba-2, Gated DeltaNet, KDA, and Mamba-3 on language modeling, commonsense reasoning, and retrieval, especially on long-context needle-in-a-haystack benchmarks.
Cached at: 05/22/26, 09:48 AM
New from NVIDIA!
You can edit a model’s compressed memory without scrambling what it already knows!
Enter Gated DeltaNet-2.
It separates the erase and write operations in linear attention using two independent gates – one for forgetting old info, another for adding new info.
Outperforms Mamba-2, Gated DeltaNet, KDA, and Mamba-3 across language modeling, commonsense reasoning, and retrieval – especially on long-context needle-in-a-haystack benchmarks.
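To make the erase/write split concrete, here is a minimal toy sketch of a delta-rule memory update with two separate gates. It is not NVIDIA's formulation: the post does not give the actual recurrence, so the scalar gates `erase` and `write`, the unit-norm keys, and the function names `read` and `gated_delta_step` are all assumptions made purely for illustration.

```python
# Toy sketch only: assumes scalar gates in [0, 1] and unit-norm keys.
# The memory S is a d_v x d_k matrix stored as a list of rows.

def read(S, k):
    """Return what the memory currently associates with key k (S @ k)."""
    return [sum(s_ij * k_j for s_ij, k_j in zip(row, k)) for row in S]

def gated_delta_step(S, k, v, erase, write):
    """One hypothetical update: S_new = S - erase * (S k) k^T + write * v k^T.

    `erase` controls how much of the old value stored under k is removed;
    `write` independently controls how much of the new value v is added.
    """
    old = read(S, k)  # value currently stored along direction k
    return [
        [S[i][j] - erase * old[i] * k[j] + write * v[i] * k[j]
         for j in range(len(k))]
        for i in range(len(v))
    ]

# Two orthogonal keys, so editing one slot should leave the other untouched.
k1, k2 = [1.0, 0.0], [0.0, 1.0]
S = [[0.0, 0.0], [0.0, 0.0]]
S = gated_delta_step(S, k1, [3.0, 4.0], erase=1.0, write=1.0)
S = gated_delta_step(S, k2, [5.0, 6.0], erase=1.0, write=1.0)

# Overwrite the value under k1: fully erase the old one, fully write the new one.
S = gated_delta_step(S, k1, [7.0, 8.0], erase=1.0, write=1.0)
print(read(S, k1))  # value under k1 after the edit
print(read(S, k2))  # value under k2, unaffected by the edit
```

In this toy setup, setting `erase=1.0, write=0.0` clears a slot without adding anything, and `erase=0.0, write=1.0` adds on top of the old value. A single shared gate cannot express both of those cases, which is the intuition behind decoupling the two operations.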
Similar Articles
Δ-Mem: Efficient Online Memory for Large Language Models
Proposes Δ-Mem, a lightweight online memory mechanism that uses a compact state matrix updated by delta-rule learning to improve long-context performance of frozen LLMs without full fine-tuning or context extension.
δ-mem: Efficient Online Memory for Large Language Models
The paper introduces δ-mem, a lightweight memory mechanism that enhances large language models by augmenting a frozen attention backbone with a compact associative memory state. It demonstrates improved performance on memory-heavy benchmarks with minimal computational overhead.
@tom_doerr: Compresses deep learning models for faster inference https://github.com/NVIDIA/Model-Optimizer…
NVIDIA Model Optimizer is a library that compresses deep learning models using techniques like quantization, distillation, pruning, and speculative decoding to accelerate inference. It supports Hugging Face, PyTorch, and ONNX models and integrates with NVIDIA inference frameworks.
@BlinkDL_AI: Gated DeltaNet-2 is almost exactly RWKV-7's DPLR recurrence, not acknowledging the elephant in the room
Ali Hatamizadeh announces Gated DeltaNet-2, a new linear attention model that outperforms KDA and Mamba-3 at 1.3B scale; @BlinkDL_AI notes its recurrence is nearly identical to RWKV-7's DPLR.
@dair_ai: // δ-mem: Efficient Online Memory for LLMs // One of the more elegant memory mechanisms I've seen this month. Most long…
The paper introduces δ-mem, a lightweight online memory mechanism that augments frozen LLMs with a compact associative memory state updated by delta-rule learning, achieving significant improvements on memory-heavy benchmarks without fine-tuning or context extension.