HRM-Text has been released as a 1B-parameter base model. Its authors claim it can be pretrained from scratch for only ~$1000, cutting compute and data volume by a factor of hundreds. It relies on efficiency techniques such as a hierarchical recursive architecture, latent-space reasoning, and PrefixLM packing. The paper and code are open source.
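To make "PrefixLM packing" concrete, here is a minimal sketch of the attention mask that idea usually implies: several sequences are packed into one row, each sequence attends bidirectionally within its prefix and causally over the rest, and no sequence attends to another. This is a generic illustration and not the HRM-Text implementation; the function name `prefix_lm_packed_mask` and the `(prefix_len, seg_len)` input format are assumptions made for the example.

```python
import numpy as np


def prefix_lm_packed_mask(segments):
    """Build a boolean attention mask for packed PrefixLM sequences.

    segments: list of (prefix_len, seg_len) pairs, one per packed sequence
              (hypothetical input format chosen for this sketch).
    Returns a [T, T] array where mask[i, j] is True if position i may
    attend to position j, with T the sum of all seg_len values.
    """
    total = sum(seg_len for _, seg_len in segments)
    mask = np.zeros((total, total), dtype=bool)
    start = 0
    for prefix_len, seg_len in segments:
        if not 0 <= prefix_len <= seg_len:
            raise ValueError("prefix_len must lie between 0 and seg_len")
        end = start + seg_len
        # Causal (lower-triangular) attention inside the segment.
        block = np.tril(np.ones((seg_len, seg_len), dtype=bool))
        # Prefix tokens additionally see the whole prefix, in both directions.
        block[:prefix_len, :prefix_len] = True
        # Writing only the diagonal block keeps packed segments isolated.
        mask[start:end, start:end] = block
        start = end
    return mask


if __name__ == "__main__":
    # Two packed sequences: lengths 3 and 2, with prefixes of 2 and 1 tokens.
    print(prefix_lm_packed_mask([(2, 3), (1, 2)]).astype(int))
```

Packing fills each training row with several short sequences instead of padding, so fewer positions are wasted; the block-diagonal structure is what keeps those sequences from leaking into one another.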
HRM-Text is a 1B-parameter text generation model that uses a brain-inspired hierarchical recurrent architecture to pretrain efficiently on only 40B tokens for ~$1000, making foundation model training more accessible by sharply reducing compute and data requirements.
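The headline figures can be put in proportion with a little arithmetic. The sketch below uses only the numbers quoted above (1B parameters, 40B tokens, ~$1000); the 6·N·D FLOPs estimate is a common rule of thumb for dense transformers and is an assumption here, since a recurrent model that reuses its weights over several steps may spend more compute per token. The function name `budget_summary` is hypothetical.

```python
def budget_summary(params=1e9, tokens=40e9, budget_usd=1000.0):
    """Derive simple ratios from the quoted training budget."""
    return {
        # Training tokens seen per model parameter.
        "tokens_per_param": tokens / params,
        # Dollars spent per billion training tokens.
        "usd_per_billion_tokens": budget_usd / (tokens / 1e9),
        # Rule-of-thumb training compute (assumption: 6 * N * D applies).
        "approx_train_flops": 6 * params * tokens,
    }


if __name__ == "__main__":
    for name, value in budget_summary().items():
        print(f"{name}: {value:.3g}")
```

With the quoted figures this works out to 40 tokens per parameter and $25 per billion tokens.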