causal-analysis


Hallucination as Trajectory Commitment: Causal Evidence for Asymmetric Attractor Dynamics in Transformer Generation

arXiv cs.CL · 2026-04-20

This paper presents causal evidence that hallucination in autoregressive language models results from early trajectory commitment governed by asymmetric attractor dynamics. Using same-prompt bifurcation and activation patching experiments on Qwen2.5-1.5B, it shows that hallucinated trajectories diverge at the first token and exhibit strong causal asymmetry across model layers.
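
Activation patching, one of the techniques named in the summary, can be illustrated on a toy model. The sketch below is not the paper's code and does not use Qwen2.5-1.5B: the three-layer network, its weights, the two inputs, and the names `run` and `patch_effect` are all assumptions invented for illustration. It caches activations from a source run, overwrites selected coordinates of one layer in a base run, and reports what fraction of the output gap the patch closes, in both directions.

```python
import math


def run(x, weights, patch=None):
    """Run a toy residual network (hypothetical, not a transformer).

    patch is None or a tuple (layer_index, coords, source_activation):
    after layer `layer_index`, the listed coordinates of the hidden
    state are overwritten with values from `source_activation`.
    Returns (scalar_output, list_of_per_layer_activations).
    """
    h = list(x)
    cache = []
    for i, w in enumerate(weights):
        mean = sum(h) / len(h)  # mixes information across coordinates
        h = [v + math.tanh(w * mean + v) for v in h]
        if patch is not None and patch[0] == i:
            _, coords, source = patch
            for k in coords:
                h[k] = source[k]
        cache.append(list(h))
    return sum(h), cache


def patch_effect(x_base, x_source, weights, layer_idx, coords):
    """Fraction of the base-to-source output gap closed by the patch.

    0.0 means the patch changed nothing; 1.0 means the base run now
    produces the source run's output.
    """
    out_base, _ = run(x_base, weights)
    out_source, cache_source = run(x_source, weights)
    out_patched, _ = run(
        x_base, weights, patch=(layer_idx, coords, cache_source[layer_idx])
    )
    return (out_patched - out_base) / (out_source - out_base)


# Illustrative values only; they stand in for two diverging trajectories.
WEIGHTS = [0.8, -0.5, 1.2]
X_A = [0.5, -0.2, 0.1]
X_B = [-0.4, 0.3, 0.2]

if __name__ == "__main__":
    for layer_idx in range(len(WEIGHTS)):
        a_into_b = patch_effect(X_B, X_A, WEIGHTS, layer_idx, coords=[0])
        b_into_a = patch_effect(X_A, X_B, WEIGHTS, layer_idx, coords=[0])
        print(f"layer {layer_idx}: A->B {a_into_b:.3f}, B->A {b_into_a:.3f}")
```

Comparing the two patching directions layer by layer is one simple way to look for the kind of asymmetry the summary describes. In a real transformer the same idea would typically be applied to hidden states captured during a forward pass rather than to a hand-written network like this one.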
