Tag
This thread argues that standard transformers have a topological flaw: once a state representation reaches the top layer, the model cannot update its beliefs over time, causing collapse as depth increases.
This paper compares xLSTM, Mamba-2, and Gated DeltaNet on complex sequence modeling tasks and finds xLSTM superior due to its enhanced state tracking and memory dynamics, validated on synthetic length-generalization tasks.
This work introduces VSTAT, a new benchmark that measures how well multimodal LLMs track states in videos, and reveals that frontier models struggle with tasks humans find easy.
This paper demonstrates that extended chain-of-thought reasoning degrades performance on deterministic state-tracking tasks due to information-theoretic limits of decoder-only attention, and proposes tool delegation when the reasoning horizon exceeds a threshold.
The article argues that the main challenge for AI agents in real-world workflows is not understanding the task but recovering from unexpected changes, tracking state, and knowing when to ask for human input.
The author observes that the hardest part of building phone-use AI agents is tracking state changes, because mobile interfaces change more dynamically and interruptively than desktop ones, and asks about others' experience.
This paper argues that robust state tracking in recurrent models depends on error control dynamics rather than just expressive capacity, proving that affine recurrent networks suffer from accumulating errors that limit their effective horizon.