Tag
This paper dissociates difficulty registration from deliberation allocation in large reasoning models (LRMs) and humans. Cross-item difficulty correlations are similar for the two, but the within-item patterns are opposite: LRMs spend more tokens on problems they get wrong, whereas humans spend less time on problems they fail.