Quickest Detection of Hallucination Onset: Delay Bounds and Learned CUSUM Statistics

Hugging Face Daily Papers 06/10/26, 06:10 AM Papers

Summary

Reformulates token-level hallucination detection as a quickest change detection problem, establishing theoretical lower bounds on detection delay and showing that causal recurrent models achieve near-optimal performance, outperforming linear baselines.

Token-level hallucination detectors are evaluated as classifiers, by AUC over all tokens, yet a streaming monitor is judged by its reaction time: the number of tokens that pass between the onset of a hallucination and the alarm. We formulate hallucination onset detection as a quickest change detection problem. A first-order Markov model of the latent faithful/hallucinated state, validated on RAGTruth, places the task inside classical change-point theory and yields Lorden's lower bound on detection delay: about 1.3 tokens at a false-alarm rate of 0.01. We then show that a causal recurrent labeler acts as a CUSUM with a learned increment; at a matched false-alarm rate it detects in 11-13 tokens, against 31 for a linear per-token baseline, and a controlled decomposition attributes most of this advantage to a better per-token score rather than to temporal accumulation. An information-rate optimality theorem of Donsker-Varadhan type explains the remaining order-of-magnitude gap: the learned score realizes only 1/4.5 of the divergence the features carry, a deficit that recalibration cannot remove, with the remainder a finite-horizon effect. Classification metrics conceal this delay structure; sequential analysis makes it measurable.
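
To make the "reaction time" metric concrete, here is a minimal sketch of a CUSUM alarm over a stream of per-token scores. It is an illustration under stated assumptions, not the paper's implementation: the function names, the toy scores, and the threshold are hypothetical, and the delay is counted inclusively (the onset token counts as 1), which may differ from the paper's convention. In the paper's framing the increment would come from a learned causal model; here it is simply a number that tends to be negative on faithful tokens and positive on hallucinated ones.

```python
def cusum_alarm(scores, threshold):
    """Return the 0-based index of the first CUSUM alarm, or None.

    `scores` are per-token increments (hypothetical stand-ins for a
    log-likelihood-ratio-like score). The statistic accumulates
    evidence and is clipped at zero so that evidence gathered before
    the change cannot go arbitrarily negative.
    """
    stat = 0.0
    for t, s in enumerate(scores):
        stat = max(0.0, stat + s)
        if stat >= threshold:
            return t
    return None


def detection_delay(scores, onset, threshold):
    """Tokens from hallucination onset to alarm, counted inclusively.

    Returns None if no alarm fires or if the alarm fires before
    `onset` (a false alarm). The inclusive convention is an assumption.
    """
    alarm = cusum_alarm(scores, threshold)
    if alarm is None or alarm < onset:
        return None
    return alarm - onset + 1


# Toy stream: five faithful tokens, then five hallucinated tokens.
toy_scores = [-1.0] * 5 + [1.0] * 5
toy_delay = detection_delay(toy_scores, onset=5, threshold=3.0)
```

Raising the threshold lowers the false-alarm rate but lengthens the delay; comparing detectors at a matched false-alarm rate, as the abstract does, amounts to tuning each detector's threshold to the same false-alarm level before measuring delay.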

Original Article

Cached at: 06/15/26, 04:59 PM

Source: https://huggingface.co/papers/2606.12476

Abstract

Token-level hallucination detection is reformulated as a quickest change detection problem, revealing fundamental limits on detection delay and demonstrating superior performance through causal recurrent modeling.

Token-level hallucination detectors are evaluated as classifiers, by AUC over all tokens, yet a streaming monitor is judged by its reaction time: the number of tokens that pass between the onset of a hallucination and the alarm. We formulate hallucination onset detection as a quickest change detection problem. A first-order Markov model of the latent faithful/hallucinated state, validated on RAGTruth, places the task inside classical change-point theory and yields Lorden's lower bound on detection delay: about 1.3 tokens at a false-alarm rate of 0.01. We then show that a causal recurrent labeler acts as a CUSUM with a learned increment; at a matched false-alarm rate it detects in 11-13 tokens, against 31 for a linear per-token baseline, and a controlled decomposition attributes most of this advantage to a better per-token score rather than to temporal accumulation. An information-rate optimality theorem of Donsker-Varadhan type explains the remaining order-of-magnitude gap: the learned score realizes only 1/4.5 of the divergence the features carry, a deficit that recalibration cannot remove, with the remainder a finite-horizon effect. Classification metrics conceal this delay structure; sequential analysis makes it measurable.
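
For readers unfamiliar with the objects the abstract names, the textbook forms of the CUSUM recursion and of Lorden's asymptotic lower bound are sketched below. This is the classical i.i.d. statement, given as background only; the paper works with a Markov model of the latent state, so its exact bound and constants may differ. The symbols $f_0$, $f_1$, $h$, $\gamma$, and $I$ are notation assumed here, not taken from the paper.

```latex
% CUSUM statistic with pre-change density f_0 and post-change density f_1
W_0 = 0, \qquad
W_t = \max\!\Bigl(0,\; W_{t-1} + \log\frac{f_1(x_t)}{f_0(x_t)}\Bigr), \qquad
T_h = \inf\{\, t \ge 1 : W_t \ge h \,\}

% Lorden's worst-case detection delay for a stopping time T
\bar{\mathbb{E}}_1(T) \;=\; \sup_{\nu \ge 1}\,
  \operatorname*{ess\,sup}\;
  \mathbb{E}_\nu\!\bigl[(T - \nu + 1)^{+} \,\big|\, x_1, \dots, x_{\nu-1}\bigr]

% Asymptotic lower bound over all T with mean time to false alarm at least gamma
\inf_{T \,:\, \mathbb{E}_\infty[T] \ge \gamma} \bar{\mathbb{E}}_1(T)
  \;\sim\; \frac{\log \gamma}{I}
  \quad (\gamma \to \infty), \qquad
I = \mathrm{KL}(f_1 \,\|\, f_0)
```

In this picture a "learned increment" replaces the exact log-likelihood ratio with a model's per-token score, and the delay grows as the information rate the score actually realizes shrinks relative to $I$.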

Similar Articles

PARALLAX: Separating Genuine Hallucination Detection from Benchmark Construction Artifacts

Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits

Max-pooling Network Revisited: Analyzing the Role of Semantic Probability in Multiple Instance Learning for Hallucination Detection

Automatic Layer Selection for Hallucination Detection

Zero-source LLM Hallucination Detection with Human-like Criteria Probing
