Our ICML paper on predictable hallucination (information-budget abstention gate), + ntkMirror: a training-free open-weight implementation we're releasing today
Summary
A paper accepted at ICML 2026 introduces predictable hallucination via an information-budget abstention gate, and releases ntkMirror, a training-free open-weight implementation that reduces hallucination by abstaining when information is insufficient, achieving 0.0–0.7% hallucination at ~24% abstention.
Similar Articles
Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits
This paper presents PCNet, a probabilistic circuit trained as a tractable density estimator on LLM residual streams to detect hallucinations as geometric anomalies. It also introduces PC-LDCD, a dynamic correction method that only intervenes on hallucinated tokens, achieving near-perfect detection and reduced corruption rates.
PARALLAX: Separating Genuine Hallucination Detection from Benchmark Construction Artifacts
This paper reveals that much of the reported progress in LLM hallucination detection is due to benchmark construction artifacts, where ground-truth answers are embedded in prompts, allowing a simple text-similarity baseline to achieve near-perfect scores. Through a large-scale controlled evaluation, the authors show that most methods perform near chance under proper controls, except for supervised probes on upper-layer hidden states such as SAPLMA and their proposed DRIFT.
Hallucination Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching
This paper proposes a memory-augmented multi-agent architecture using nested learning, continuum memory systems, and semantic caching to mitigate hallucination in LLM pipelines, achieving significant reductions in factual errors while improving operational efficiency.
RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration
RAGognizer introduces a hallucination-aware fine-tuning approach that integrates a lightweight detection head into LLMs for joint optimization of language modeling and hallucination detection in RAG systems. The paper presents RAGognize, a dataset of naturally occurring closed-domain hallucinations with token-level annotations, and demonstrates state-of-the-art hallucination detection while reducing hallucination rates without degrading language quality.
Quickest Detection of Hallucination Onset: Delay Bounds and Learned CUSUM Statistics
Reformulates token-level hallucination detection as a quickest change detection problem, establishing theoretical lower bounds on detection delay and showing that causal recurrent models achieve near-optimal performance, outperforming linear baselines.