Tag
This paper reveals that hallucination in large vision-language models is caused by a dynamic structural misalignment in which certain attention heads act as risky mediators, decoupling from visual evidence and locking onto language priors. The authors propose Fox, a training-free causal intervention framework that diagnoses and severs these pathological shortcuts, achieving state-of-the-art performance in faithful decoding.
This paper demonstrates that weight norm causally controls the timescale of grokking in neural networks, reconciling conflicting accounts. Through interventions, it shows that grokking follows an exponential delay law and that norm magnitude, rather than learning rate, dominates grokking time across architectures.
This paper identifies imbalanced groups of attention heads in multimodal large language models (MLLMs) that either drive or resist modality-conflict hallucination. It proposes MACI, a causal intervention that suppresses the hallucination-driving heads only when a conflict is detected, achieving large reductions in hallucination across five models.