Tag
This paper formalizes the sufficiency gap in next-token prediction, showing that even an ideal sequence model can become overconfident when textual prefixes are not sufficient statistics for the latent circumstances that produced them. It proposes an external observer mechanism that reduces, but does not eliminate, this gap.