Our ICML paper on predictable hallucination (information-budget abstention gate), + ntkMirror: a training-free open-weight implementation we're releasing today

Reddit r/LocalLLaMA 06/09/26, 04:23 PM Papers

hallucination-detection abstention open-weights icml-paper ntkmirror order-sensitivity

Summary

A paper accepted at ICML 2026 argues that hallucination in evidence-grounded QA is predictable from an information budget and turns that into a fixed answer/abstain gate. The authors also release ntkMirror, a training-free implementation for open-weight models that abstains when information is insufficient. In the paper's held-out audit, the gate reaches 0.0–0.7% hallucination at ~24% abstention.

Our paper, *Predictable Compression Failures: Order Sensitivity and Information Budgeting for Evidence-Grounded Binary Adjudication*, was accepted at ICML 2026.

Paper: [https://arxiv.org/abs/2509.11208](https://arxiv.org/abs/2509.11208)

**The idea:** in evidence-grounded QA, the order in which you present exchangeable evidence changes the model's answer probability (permutation dispersion). We treat order as a nuisance variable, derive the Expectation-level Decompression Law (EDFL) relating expected information budget to achievable reliability, and turn it into a fixed ISR=1 answer/abstain gate with no threshold tuning. When information is insufficient, the model abstains instead of guessing.

In the paper's pre-specified held-out audit, the gate reaches 0.0–0.7% hallucination at ~24% abstention (80.5% accuracy on attempts), with the ISR=1 boundary fixed by theory rather than tuned.

**What we're releasing today (ntkMirror):** a training-free implementation of that gate for local open-weight models. It scores each claim under multiple evidence orderings (order-marginal verifier, exact tied-branch scoring), computes ISR from the per-permutation probabilities, and gates answer/abstain. No fine-tuning, no second model, runs on your own weights offline. We also ship a fused kernel that batches the permutation forwards: bit-identical to the naive loop at fp32, 2.6–10× faster.

**New results (not in the paper):** run as a hallucination detector across small local models, AUROC on VitaminC / BoolQ / SciFact:

|Model|VitaminC|BoolQ|SciFact|
|:-|:-|:-|:-|
|Qwen2.5-0.5B|0.78|0.69|0.80|
|Qwen2.5-1.5B|0.69|0.78|0.91|
|Gemma E4B|0.88|0.84|0.96|
|Qwen2.5-7B|0.90|0.87|0.94|

Separation generally improves with model size (the one clear exception is VitaminC, where Qwen2.5-1.5B scores below Qwen2.5-0.5B) and is strongest on SciFact and the larger models.

Used as a gate on balanced data, the grounded fraction of accepted claims rises from 50% to roughly 75–90% depending on model/dataset, at the cost of dropping ~10–20% of valid claims.

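
A quick way to read those gate numbers: on balanced data, the grounded fraction of accepted claims depends only on how many valid and how many invalid claims the gate lets through. The sketch below is plain arithmetic, not ntkMirror code. The function names are made up for illustration, and the 0.85 / 0.80 pair is one illustrative point picked from inside the ranges above, not a measured result.

```python
def grounded_fraction(keep_valid: float, keep_invalid: float,
                      prior_valid: float = 0.5) -> float:
    """Share of accepted claims that are grounded.

    keep_valid   : fraction of valid claims the gate accepts
    keep_invalid : fraction of invalid claims the gate accepts
    prior_valid  : share of valid claims before gating (0.5 = balanced data)
    """
    accepted_valid = prior_valid * keep_valid
    accepted_invalid = (1.0 - prior_valid) * keep_invalid
    return accepted_valid / (accepted_valid + accepted_invalid)


def implied_keep_invalid(keep_valid: float, target_fraction: float,
                         prior_valid: float = 0.5) -> float:
    """Invalid-claim pass rate implied by a given grounded fraction."""
    return (prior_valid * keep_valid * (1.0 - target_fraction)
            / (target_fraction * (1.0 - prior_valid)))


# No gate at all: everything is accepted, so balanced data stays at 50%.
baseline = grounded_fraction(keep_valid=1.0, keep_invalid=1.0)

# Illustrative point: keep 85% of valid claims and land at 80% grounded.
# This is an assumed pairing, chosen only to show the arithmetic.
keep_invalid = implied_keep_invalid(keep_valid=0.85, target_fraction=0.80)
after_gate = grounded_fraction(keep_valid=0.85, keep_invalid=keep_invalid)

print(f"baseline grounded fraction: {baseline:.2f}")
print(f"implied invalid pass rate:  {keep_invalid:.4f}")
print(f"grounded fraction after gate: {after_gate:.2f}")
```

In other words, under these assumed values the gate would have to reject most invalid claims while dropping only a modest share of valid ones to move the accepted set from 50% grounded to 80%.
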
The kernel doesn't affect accuracy (AUROC gap ≤0.008); it just makes the gate cheap.

Please let me know if you find it useful: [https://github.com/leochlon/ntkmirror](https://github.com/leochlon/ntkmirror)
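
For anyone who wants the core idea in code before opening the repo, here is a minimal sketch of the naive per-permutation loop: score one claim under several evidence orderings, then look at the mean (the order-marginal probability) and the spread (permutation dispersion). This is not the ntkMirror API. `toy_scorer` is a hypothetical stand-in for a real model call, `score_orderings` is a name invented for this example, and the sketch stops before ISR because its formula is defined in the paper rather than here.

```python
import itertools
import math
import random
import statistics
from typing import Callable, List, Sequence

Scorer = Callable[[str, Sequence[str]], float]


def toy_scorer(claim: str, evidence: Sequence[str]) -> float:
    """Hypothetical stand-in for a model call returning P(claim is supported).

    Earlier snippets are weighted more heavily on purpose, so the score
    depends on evidence order the way a real model's might.
    """
    logit = 0.0
    for position, snippet in enumerate(evidence):
        vote = 1.0 if "supports" in snippet else -1.0
        logit += vote / (position + 1)
    return 1.0 / (1.0 + math.exp(-logit))


def score_orderings(claim: str, evidence: Sequence[str], scorer: Scorer,
                    max_orderings: int = 24, seed: int = 0) -> List[float]:
    """Score the claim once per evidence ordering (the naive loop).

    Enumerates every permutation when there are few enough, otherwise
    draws max_orderings random shuffles with a fixed seed.
    """
    n = len(evidence)
    if math.factorial(n) <= max_orderings:
        orderings = list(itertools.permutations(evidence))
    else:
        rng = random.Random(seed)
        orderings = [rng.sample(list(evidence), n) for _ in range(max_orderings)]
    return [scorer(claim, list(order)) for order in orderings]


evidence = ["doc A supports", "doc B refutes", "doc C supports"]
probs = score_orderings("example claim", evidence, toy_scorer)

order_marginal = statistics.fmean(probs)  # average over orderings
dispersion = statistics.pstdev(probs)     # spread caused by order alone

print(f"orderings scored: {len(probs)}")
print(f"order-marginal probability: {order_marginal:.3f}")
print(f"permutation dispersion (std): {dispersion:.3f}")
```

Each ordering costs one forward pass in this loop, which is the cost the fused kernel is meant to cut by batching the permutations.
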

Similar Articles

Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits

PARALLAX: Separating Genuine Hallucination Detection from Benchmark Construction Artifacts

Hallucination Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching

RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration

Quickest Detection of Hallucination Onset: Delay Bounds and Learned CUSUM Statistics
