recurrent-attention

Augmenting Attention with Exponentially Decaying Memory Improves Query-Aware KV Sparsity

Hugging Face Daily Papers · 2026-05-27

This paper explores how an exponentially decaying memory module from RAT+ can improve query-aware sparse inference methods for long-context language models, demonstrating consistent accuracy gains across various sparse budgets on needle-in-a-haystack tasks.
