attention-alternative

An Update on Matrix Recurrent Units, an Attention Alternative [R]

Reddit r/MachineLearning · 2026-06-21

An update on Matrix Recurrent Units (MRU), a linear-time alternative to attention. The author explores methods for stabilizing training and finds that orthogonal matrices underperform, while an LDU factorization works best. The post also shows that MRU underperforms transformers on larger datasets such as TinyStories.
