New SOTA 1B model? HRM-Text
Summary
HRM-Text is a 1B-parameter hierarchical reasoning language model from Sapient Intelligence. It carries out its reasoning in an internal latent space and reportedly outperforms most models of the same size at an extremely low training cost.
Cached at: 05/19/26, 10:50 PM
Similar Articles
@Sapient_Int: Introducing HRM-Text. An ultra-lean 1B-parameter reasoning language model designed to deliver strong general performanc…
Sapient Intelligence introduces HRM-Text, a 1B-parameter reasoning language model trained on only 40B tokens with a budget of $1,000, achieving competitive performance while drastically reducing data and compute requirements.
sapientinc/HRM-Text-1B
Sapient Intelligence released HRM-Text-1B, a 1-billion-parameter language model with a novel dual-timescale recurrent architecture (Hierarchical Reasoning Model) that provides unbounded compute depth at bounded parameter count. The pre-alignment checkpoint is available on Hugging Face.
HRM Seems To Be Going Off Right Now
Sapient Intelligence has released HRM-Text, a 1B-parameter text generation model trained on only 0.04 trillion (40B) tokens at a cost of approximately $1,000. On multiple reasoning benchmarks it reportedly surpasses much larger models trained on 100 to 1,000 times more data, which the post frames as the beginning of a new paradigm for AI training.
@vintcessun: Pretraining can be this cost-effective? Train a usable 1B base model from scratch for ~$1000, slashing compute and data by hundreds of times. The key isn't brute-force compute, but hierarchical recursive architecture plus latent space reasoning, combined with PrefixLM packing and FA3 to maximize efficiency. Sounds insane, but the paper and code are open-sourced.
HRM-Text is a 1B-parameter base model that its authors claim can be pretrained from scratch for only ~$1000, cutting compute and data requirements by a factor of hundreds. It relies on efficiency techniques such as a hierarchical recursive architecture, latent space reasoning, and PrefixLM packing. The paper and code are open-sourced.
HRM-Text: Trained on only $1k and 40B tokens with a brain-inspired hierarchical latent architecture
HRM-Text is a 1B parameter text generation model that uses a brain-inspired hierarchical recurrent architecture to achieve efficient pretraining with only 40B tokens and ~$1000, enabling accessible foundation model training with dramatically reduced compute and data requirements.
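The "unbounded compute depth at bounded parameter count" idea in the sapientinc/HRM-Text-1B entry above can be illustrated with a toy two-timescale recurrence. This is only a sketch under stated assumptions, not Sapient Intelligence's implementation: the class name `TwoTimescaleCell`, the weight names, the tanh update rule, and the step counts are all hypothetical. It shows a fast state that updates every step and a slow state that updates once per block of fast steps, with the same weights reused no matter how long the loop runs.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialization only)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def tanh_add(*vectors):
    """Element-wise tanh of the sum of several equal-length vectors."""
    return [math.tanh(sum(values)) for values in zip(*vectors)]


class TwoTimescaleCell:
    """Hypothetical toy cell: one fast state and one slow state sharing fixed weights."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.w_fast = make_matrix(dim, dim, rng)          # fast -> fast
        self.w_slow_to_fast = make_matrix(dim, dim, rng)  # slow -> fast
        self.w_in = make_matrix(dim, dim, rng)            # input -> fast
        self.w_slow = make_matrix(dim, dim, rng)          # slow -> slow
        self.w_fast_to_slow = make_matrix(dim, dim, rng)  # fast -> slow

    def param_count(self):
        # Five dim x dim matrices, independent of how many steps are run.
        return 5 * self.dim * self.dim

    def run(self, x, slow_steps, fast_steps_per_slow):
        """Return the final slow state and the number of state updates performed."""
        fast = [0.0] * self.dim
        slow = [0.0] * self.dim
        updates = 0
        for _ in range(slow_steps):
            # Fast timescale: several updates conditioned on the current slow state.
            for _ in range(fast_steps_per_slow):
                fast = tanh_add(
                    matvec(self.w_fast, fast),
                    matvec(self.w_slow_to_fast, slow),
                    matvec(self.w_in, x),
                )
                updates += 1
            # Slow timescale: one update per block of fast updates.
            slow = tanh_add(matvec(self.w_slow, slow), matvec(self.w_fast_to_slow, fast))
            updates += 1
        return slow, updates


cell = TwoTimescaleCell(dim=4)
x = [1.0, 0.0, 0.0, 0.0]
shallow_state, shallow_updates = cell.run(x, slow_steps=2, fast_steps_per_slow=3)
deep_state, deep_updates = cell.run(x, slow_steps=8, fast_steps_per_slow=3)
```

In this toy, `deep_updates` is four times `shallow_updates` while `cell.param_count()` is unchanged, which is the sense in which compute depth can grow without adding parameters.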