memory-module

#memory-module

Augmenting Attention with Exponentially Decaying Memory Improves Query-Aware KV Sparsity

Hugging Face Daily Papers · 2026-05-27

This paper explores how an exponentially decaying memory module from RAT+ can improve query-aware sparse inference methods for long-context language models, demonstrating consistent accuracy gains across various sparse budgets on needle-in-a-haystack tasks.



Exact Linear Attention

arXiv cs.LG · 2026-05-20

This paper introduces Exact Linear Attention (ELA), a mechanism that achieves linear computational complexity for Transformer attention without approximation error by leveraging kernel decomposition, and addresses gradient explosion and token dilution through constrained kernel functions. It also presents engineering innovations including Hyper Link, Memory Lobe, and a routing bias for Mixture of Experts.



NGM: A Plug-and-Play Training-Free Memory Module for LLMs

arXiv cs.AI · 2026-05-19

This paper presents NGM, a plug-and-play, training-free memory module for LLMs. It combines a Causal N-Gram Encoder with a Cosine-Gated Memory Injector to improve performance on code generation and knowledge-intensive tasks.



NGM: A Plug-and-Play Training-Free Memory Module for LLMs

Hugging Face Daily Papers · 2026-05-16

NGM is a training-free, plug-and-play memory module for LLMs that retrieves N-gram knowledge through pretrained token embeddings, with no separate retrieval pipeline, and achieves gains of up to 3 points on code generation and knowledge tasks.



NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation

Hugging Face Daily Papers · 2026-05-11

NanoResearch is a multi-agent framework designed to personalize research automation by co-evolving skills, memory, and policy to adapt to individual user preferences and research styles.

