This paper introduces sparse prefix caching for hybrid and recurrent LLMs, which stores recurrent states at a limited set of checkpoint positions to avoid dense caching while minimizing recomputation. The method outperforms standard heuristics on real-world data, especially when requests share substantial but non-identical prefixes.
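The idea can be illustrated with a minimal sketch: snapshot the recurrent state only at sparse checkpoint positions during prefill, then, for a new request, resume from the longest cached checkpoint that matches a prefix of its tokens and recompute only the tail. The function names, cache layout, and checkpoint stride below are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of sparse prefix caching for recurrent states.
# `step(state, token)` stands in for one step of a hypothetical recurrent/SSM layer;
# the stride, cache keys, and lookup strategy are assumptions for illustration.
from typing import Dict, List, Tuple

def prefill_with_checkpoints(tokens: List[int], step, init_state,
                             stride: int = 256) -> Dict[Tuple[int, ...], object]:
    """Run the recurrence over `tokens`, storing the state only every `stride`
    positions instead of at every token (dense caching)."""
    cache: Dict[Tuple[int, ...], object] = {}
    state = init_state
    for i, tok in enumerate(tokens, start=1):
        state = step(state, tok)
        if i % stride == 0:
            cache[tuple(tokens[:i])] = state  # keyed by the exact prefix
    return cache

def resume_from_cache(tokens: List[int], cache, step, init_state):
    """Find the longest cached checkpoint that is a prefix of `tokens`,
    then recompute only the uncached suffix."""
    best_len, state = 0, init_state
    for prefix, cached_state in cache.items():
        n = len(prefix)
        if n > best_len and tuple(tokens[:n]) == prefix:
            best_len, state = n, cached_state
    for tok in tokens[best_len:]:  # only the tail past the checkpoint is recomputed
        state = step(state, tok)
    return state
```

With a large stride the cache stays small, while requests that share a long prefix still skip most of the prefill work, which is the trade-off the paper's checkpoint placement aims to optimize.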
The article introduces Raven, a new State Space Model (SSM) with selective memory allocation that achieves state-of-the-art performance on recall tasks and demonstrates superior length generalization compared to existing approaches such as sliding-window attention (SWA).
Researchers from MIT CSAIL and other institutions introduced CompreSSM, a technique that compresses state-space AI models during training by removing unnecessary components early, resulting in faster training and smaller models without sacrificing performance.
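To make the "remove unnecessary components early" idea concrete, the sketch below prunes low-importance state dimensions from an SSM layer after a brief warm-up and then continues training the smaller model. The layer structure, the magnitude-based importance score, and the pruning schedule are assumptions for illustration; CompreSSM's actual criterion may differ.

```python
# Sketch of pruning state dimensions from a state-space layer early in training.
# The importance score and schedule are illustrative assumptions, not CompreSSM's rule.
import torch
import torch.nn as nn

class SSMLayer(nn.Module):
    def __init__(self, d_model: int, d_state: int):
        super().__init__()
        # Diagonal state transition plus input/output projections per state dimension.
        self.A = nn.Parameter(torch.randn(d_state))
        self.B = nn.Parameter(torch.randn(d_state, d_model))
        self.C = nn.Parameter(torch.randn(d_model, d_state))

    def prune_states(self, keep: int) -> None:
        """Drop the lowest-importance state dimensions, shrinking the layer."""
        score = self.B.abs().sum(dim=1) * self.C.abs().sum(dim=0)  # per-state importance
        idx = score.topk(keep).indices.sort().values
        self.A = nn.Parameter(self.A.data[idx])
        self.B = nn.Parameter(self.B.data[idx])
        self.C = nn.Parameter(self.C.data[:, idx])

# Early in training (e.g. after a few warm-up epochs), shrink the state once
# and train the smaller model for the remaining steps.
layer = SSMLayer(d_model=64, d_state=128)
layer.prune_states(keep=32)
```

Pruning early, rather than after convergence, is what yields the reported savings in training time as well as model size.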