recurrent-models

#recurrent-models

Generic Triple-Latent Compression with Gated Associative Retrieval

arXiv cs.CL · 4d ago

This paper introduces generic triple-latent recurrent models that compress token-pair interactions into a latent state, along with a gated associative retrieval variant that improves exact recall. The hybrid model outperforms Transformers on byte-level WikiText-2 and a tokenized language benchmark, achieving up to 41.9% associative recall versus 25%.

WriteSAE: Sparse Autoencoders for Recurrent State

Hugging Face Daily Papers · 2026-05-12

WriteSAE introduces the first sparse autoencoder that decomposes matrix cache writes in state-space and hybrid recurrent language models, enabling more effective token-level interventions than existing methods.

Rethinking State Tracking in Recurrent Models Through Error Control Dynamics

Hugging Face Daily Papers · 2026-05-08

This paper argues that robust state tracking in recurrent models depends on error control dynamics rather than just expressive capacity, proving that affine recurrent networks suffer from accumulating errors that limit their effective horizon.
