Variational Linear Attention: Stable Associative Memory for Long-Context Transformers

arXiv cs.LG · 12h ago

This paper introduces Variational Linear Attention (VLA), a method that stabilizes memory states in linear attention mechanisms for long-context transformers. VLA reframes memory updates as an online regularized least-squares problem, proving bounded state norms and demonstrating significant speedups and improved retrieval accuracy over standard linear attention and DeltaNet.
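
To make the least-squares framing concrete, here is a minimal sketch contrasting the standard additive linear attention write with a textbook recursive least-squares (ridge) write. It is not the paper's VLA update, which the summary does not spell out; the function names, the regularization weight `lam`, and the toy key/value pair are all assumptions chosen for illustration.

```python
import math

def matvec(M, x):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def frobenius_norm(M):
    return math.sqrt(sum(m * m for row in M for m in row))

def additive_update(S, k, v):
    # Standard linear attention write: S <- S + v k^T.
    return [[S[i][j] + v[i] * k[j] for j in range(len(k))]
            for i in range(len(v))]

def ridge_update(S, P, k, v):
    # One recursive least-squares step (assumed illustration, not VLA itself).
    # S minimizes sum_i ||S k_i - v_i||^2 + lam * ||S||_F^2, and
    # P tracks (lam * I + sum_i k_i k_i^T)^{-1} via Sherman-Morrison.
    Pk = matvec(P, k)
    denom = 1.0 + sum(ki * pki for ki, pki in zip(k, Pk))
    err = [vi - si for vi, si in zip(v, matvec(S, k))]
    S_new = [[S[i][j] + err[i] * Pk[j] / denom for j in range(len(k))]
             for i in range(len(v))]
    P_new = [[P[i][j] - Pk[i] * Pk[j] / denom for j in range(len(k))]
             for i in range(len(k))]
    return S_new, P_new

def run(steps=100, lam=1.0):
    # Hypothetical toy stream: the same key/value pair written repeatedly.
    k, v = [1.0, 0.0], [0.0, 2.0]
    d_k, d_v = len(k), len(v)
    S_add = [[0.0] * d_k for _ in range(d_v)]
    S_rls = [[0.0] * d_k for _ in range(d_v)]
    P = [[(1.0 / lam if i == j else 0.0) for j in range(d_k)]
         for i in range(d_k)]
    for _ in range(steps):
        S_add = additive_update(S_add, k, v)
        S_rls, P = ridge_update(S_rls, P, k, v)
    return frobenius_norm(S_add), frobenius_norm(S_rls)

if __name__ == "__main__":
    add_norm, rls_norm = run()
    print(f"additive state norm: {add_norm:.3f}")
    print(f"ridge state norm:    {rls_norm:.3f}")
```

In this toy setting the additive state norm grows linearly with the number of writes, while the ridge state stays below the norm of the stored value. That is the kind of boundedness the summary attributes to VLA, although the paper's actual update rule and proof may differ from this sketch.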
