Tag
STARE addresses policy entropy collapse in GRPO-based reinforcement learning for large language models by introducing surprisal-guided token-level advantage reweighting and target-entropy regulation, achieving 4%-8% accuracy gains on AIME benchmarks.
Introduces trajectory extrapolation error, a measure derived from transformer LM hidden states that predicts human reading times independently of surprisal, revealing a dissociable component of incremental processing cost.
This paper tests the Parse Multiplicity Mismatch Hypothesis, which proposes that language models underpredict human processing difficulty in garden path sentences because they can consider more simultaneous parses than humans do. Using RNNGs with beam search, the authors find that reducing the number of active parses increases predicted garden path effects, but not enough to fully capture human data.