Tag
This paper introduces geometric stability, a measure of how reliably pairwise stimulus distances reproduce across trials. It shows that the measure is behaviorally relevant and circuit-dependent across brain regions, and uses an attractor network model to explain its emergence.
The article reports a spectral ratio between MLP and attention norms that predicts geometric stability in transformer models, with an optimal range of 0.5–2 for preventing rank collapse.
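
As a rough illustration of the first summary's definition (how reliably pairwise stimulus distances reproduce across trials), the sketch below scores stability as a split-half correlation of pairwise distances. This is an assumption made for illustration, not the paper's stated procedure: the function names, the Euclidean metric, the odd/even trial split, and the Pearson correlation are all hypothetical choices.

```python
import numpy as np


def pairwise_distances(x):
    """Upper-triangle Euclidean distances between rows of x (n_stimuli, n_features)."""
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(x.shape[0], k=1)
    return dist[upper]


def split_half_stability(responses):
    """Hypothetical stability score for responses of shape (n_trials, n_stimuli, n_features).

    Assumed procedure: average even- and odd-indexed trials separately,
    compute pairwise stimulus distances in each half, and return the
    Pearson correlation between the two distance vectors.
    """
    first = responses[0::2].mean(axis=0)
    second = responses[1::2].mean(axis=0)
    d_first = pairwise_distances(first)
    d_second = pairwise_distances(second)
    return float(np.corrcoef(d_first, d_second)[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(6, 4))           # 6 stimuli, 4 features (synthetic)
    clean = np.repeat(base[None], 8, axis=0)  # 8 identical trials
    noisy = clean + rng.normal(scale=5.0, size=clean.shape)
    print("clean:", split_half_stability(clean))
    print("noisy:", split_half_stability(noisy))
```

Under these assumptions, identical trials give a score near 1, and added trial-to-trial noise lowers it; other distance metrics or trial splits would be equally defensible.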