structured-recurrent-mixer

Structured Recurrent Mixers for Massively Parallelized Sequence Generation

arXiv cs.CL · 2026-05-12

This paper introduces the Structured Recurrent Mixer (SRM), an architecture whose parallel training form can be converted algebraically into a recurrent inference form without specialized kernels. In the reported experiments, SRMs achieve significantly higher throughput and concurrency than Transformers and perform effectively on reinforcement learning tasks.
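The summary does not describe the SRM's actual mixing structure, so the following is only a minimal sketch of the general idea of a parallel form and a recurrent form that compute the same outputs. It assumes a plain scalar linear recurrence with a fixed decay; the function names `mix_parallel` and `mix_recurrent` and the `decay` parameter are hypothetical and are not taken from the paper.

```python
def mix_parallel(x, decay):
    """Training-style form: every output is computed from the full sequence.

    Output t is the sum over s <= t of decay**(t - s) * x[s], which is a
    product with a lower-triangular (causal) mixing matrix.
    """
    n = len(x)
    # Build the causal mixing matrix explicitly; row t only sees positions <= t.
    matrix = [
        [decay ** (t - s) if s <= t else 0.0 for s in range(n)]
        for t in range(n)
    ]
    return [sum(matrix[t][s] * x[s] for s in range(n)) for t in range(n)]


def mix_recurrent(x, decay):
    """Inference-style form: one token at a time, carrying a fixed-size state."""
    state = 0.0
    outputs = []
    for value in x:
        # Assumed update rule for this sketch: h_t = decay * h_{t-1} + x_t.
        state = decay * state + value
        outputs.append(state)
    return outputs


sequence = [1.0, 2.0, 3.0, 4.0]
parallel_out = mix_parallel(sequence, decay=0.5)
recurrent_out = mix_recurrent(sequence, decay=0.5)

# Both forms should agree up to floating-point error.
agree = all(abs(p - r) < 1e-9 for p, r in zip(parallel_out, recurrent_out))
```

In this toy setting the parallel form touches the whole sequence at once, while the recurrent form keeps a single number of state per sequence, which is the kind of property that would allow many sequences to be generated concurrently.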
