hierarchical-recurrent-model

HRM-Text: Efficient Pretraining Beyond Scaling

arXiv cs.CL · 2026-05-21

HRM-Text introduces a Hierarchical Recurrent Model that decouples computation into slow and fast layers. This enables pretraining from scratch on only 40 billion tokens and a $1,500 budget while achieving performance competitive with larger models.
