Tag
This paper presents causal evidence that hallucination in autoregressive language models results from early trajectory commitment governed by asymmetric attractor dynamics, using same-prompt bifurcation and activation patching experiments on Qwen2.5-1.5B to show that hallucinated trajectories diverge at the first token and exhibit strong causal asymmetry across model layers.