Tag
The author shares thoughts on making convergence a reliable halting signal for iterative weight-tied models, discussing tricks from papers like DEQ, Huginn, Ouro, and EqR, and highlighting the roles of pre-norm and input injection.
Equilibrium Reasoners (EqR) introduce a novel framework for scalable reasoning by learning task-conditioned attractors in latent dynamical systems, achieving over 99% accuracy on Sudoku-Extreme by unrolling up to 40,000 layers.