hallucination-detection

#hallucination-detection

From Signals to Transfer: A Factorised Study of Probe-Based Uncertainty Estimation in Large Language Models

arXiv cs.CL · yesterday

This paper presents a factorised study of probe-based uncertainty estimation in LLMs, showing that raw hidden states and attention features perform well in-domain but structured features are more robust under distribution shift, and provides pretrained probes as off-the-shelf baselines.

GAVEL: Grounded Caption Error Verification and Localization

arXiv cs.CL · 4d ago

GAVEL introduces a new task for verifying, explaining, and localizing errors in image-text pairs, along with a dataset and benchmark. A supervised baseline shows improvements over strong closed-source models.

MedBench v5: A Dynamic, Process-Oriented, and Hallucination-Aware Benchmark for Clinical Multimodal Models

arXiv cs.CL · 6d ago

MedBench v5 is a dynamic, process-oriented benchmark for clinical multimodal models that integrates hallucination detection and stress testing, moving beyond static QA to evaluate reasoning and stability under information-flow stressors.

Zero-source LLM Hallucination Detection with Human-like Criteria Probing

arXiv cs.AI · 2026-06-12

Proposes HCPD, a zero-source hallucination detection method that uses a human-like criteria probing mechanism to decompose judgments into interpretable criteria, outperforming state-of-the-art baselines.

Layer-Resolved Optimal Transport for Hallucination Detection in NMT and Abstractive Summarization

arXiv cs.CL · 2026-06-12

This paper extends optimal transport-based hallucination detection to all decoder layers in NMT and abstractive summarization, finding that detection is concentrated in early layers and that the geometric signal transfers poorly to summarization due to faithfulness failures not detectable via attention concentration.

Quickest Detection of Hallucination Onset: Delay Bounds and Learned CUSUM Statistics

Hugging Face Daily Papers · 2026-06-10

Reformulates token-level hallucination detection as a quickest change detection problem, establishing theoretical lower bounds on detection delay and showing that causal recurrent models achieve near-optimal performance, outperforming linear baselines.
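
For readers unfamiliar with quickest change detection, the following is a minimal sketch of the classical CUSUM stopping rule applied to a stream of per-token scores. It is not the paper's learned statistic: the function name `cusum_alarm`, the `drift` and `threshold` values, and the example scores are all illustrative assumptions.

```python
from typing import Optional, Sequence


def cusum_alarm(scores: Sequence[float],
                drift: float = 0.5,
                threshold: float = 4.0) -> Optional[int]:
    """Classical one-sided CUSUM over per-token scores (hypothetical inputs).

    Returns the index of the first token at which the cumulative statistic
    exceeds `threshold`, or None if no alarm is raised.
    """
    stat = 0.0
    for t, score in enumerate(scores):
        # Accumulate evidence above the drift; reset to zero when it goes negative.
        stat = max(0.0, stat + score - drift)
        if stat > threshold:
            return t
    return None


# Assumed example: low scores for four tokens, then a sustained rise.
scores = [0.1, 0.2, 0.0, 0.3, 2.0, 2.5, 1.8, 2.2]
alarm = cusum_alarm(scores)
print(alarm)
```

In this sketch the gap between the token where scores start rising and the token where the alarm fires is the detection delay, the quantity the summary says the paper bounds.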

Our ICML paper on predictable hallucination (information-budget abstention gate), + ntkMirror: a training-free open-weight implementation we're releasing today

Reddit r/LocalLLaMA · 2026-06-09

A paper accepted at ICML 2026 treats hallucination as predictable and proposes an information-budget abstention gate. The authors also release ntkMirror, a training-free open-weight implementation that reduces hallucination by abstaining when information is insufficient, achieving 0.0–0.7% hallucination at ~24% abstention.

OpenHalDet: A Unified Benchmark for Hallucination Detection across Diverse Generation Scenarios

arXiv cs.CL · 2026-06-08

OpenHalDet is a unified benchmark for hallucination detection in LLMs, standardizing evaluation across diverse generation scenarios and supporting black-box, gray-box, and white-box detection methods.

Evidence Graph Consistency in Retrieval-Augmented Generation: A Model-Dependent Analysis of Hallucination Detection

arXiv cs.CL · 2026-06-08

Proposes Evidence Graph Consistency (EGC), a framework using graph-based structural consistency for hallucination detection in RAG, revealing that effectiveness varies across model families.

Whisper Hallucination Detection and Mitigation via Hidden Representation Steering and Sparse AutoEncoders

Hugging Face Daily Papers · 2026-06-05

This paper demonstrates that Whisper's hallucination failures on silence, noise, or music can be detected and mitigated purely from internal activations using sparse autoencoders, achieving large reductions in hallucination rate without fine-tuning.

Cascading Hallucination in Agentic RAG: The CHARM Framework for Detection and Mitigation

arXiv cs.AI · 2026-06-04

This paper introduces CHARM, a framework for detecting and mitigating cascading hallucinations in multi-step agentic RAG pipelines, where early-stage errors propagate and amplify across reasoning steps. CHARM achieves an 89.4% cascade detection rate and 82.1% error propagation reduction across multiple benchmarks with low latency overhead.

KG-Guard: Graph-Based Hallucination Detection for Knowledge Base Question Answering

arXiv cs.LG · 2026-06-02

KG-Guard is a lightweight graph-based framework for detecting hallucinations in LLM-based knowledge base question answering. It treats the LLM as a black box and uses a graph encoder with an MLP classifier to identify hallucinated answer nodes, outperforming baselines while having far fewer parameters.

FLaG: Fine-Grained Latent Grouping for Hallucination Detection

arXiv cs.LG · 2026-06-02

FLaG is a lightweight framework for hallucination detection in LLMs that models correctness via latent evidence groups and energy-based routing, achieving state-of-the-art performance across benchmarks.

LLM-FACETS: A Privacy-Preserving Framework for Evaluating LLM Transparency and Accountability

arXiv cs.AI · 2026-06-01

LLM-FACETS is an open-source evaluation framework designed to help practitioners assess LLM transparency and accountability with a focus on privacy and data flow transparency. It provides a browser interface, plugin architecture, and supports multiple auditing mechanisms including token-level log-probability visualization and RAG Triad metrics.

Hallucination Detection-Guided Preference Optimization for Clinical Summarization

arXiv cs.CL · 2026-05-29

Introduces HDSR and HDSR-PL, methods that use hallucination detectors to guide iterative self-refinement and preference learning, achieving up to 48% reduction in hallucinations for clinical summarization using Llama and Gemma models on MIMIC-IV-Note.

Neuro-Symbolic Verification of LLM Outputs for Data-Sensitive Domains (extended preprint)

arXiv cs.AI · 2026-05-27

This paper presents a neuro-symbolic verification architecture for LLM outputs in high-stakes domains, combining formal symbolic methods with neural semantic analysis. Evaluated on a medical device damage assessment system, it achieves over 83% hallucination detection for structured entities and 30% reduction in report creation time.

Automatic Layer Selection for Hallucination Detection

arXiv cs.AI · 2026-05-27

This paper proposes automatic layer selection for hallucination detection in LLMs and introduces First Effective Peak of Intrinsic Dimension (FEPoID), a training-free criterion that consistently identifies optimal intermediate layers, outperforming existing heuristics.

Graph Alignment Topology as an Inductive Bias for Grounding Detection

arXiv cs.CL · 2026-05-25

This paper introduces Graph Alignment Topology as an inductive bias for grounding detection, using a graph neural network to model alignment structure between reference information and LLM outputs. The method achieves state-of-the-art results on multiple hallucination and question-answering datasets, outperforming GPT-4o.

PARALLAX: Separating Genuine Hallucination Detection from Benchmark Construction Artifacts

arXiv cs.CL · 2026-05-19

This paper reveals that much of the reported progress in LLM hallucination detection is due to benchmark construction artifacts, where ground-truth answers are embedded in prompts, allowing a simple text-similarity baseline to achieve near-perfect scores. Through a large-scale controlled evaluation, the authors show that most methods perform near chance under proper controls, except for supervised probes on upper-layer hidden states such as SAPLMA and their proposed DRIFT.

Uncertainty Quantification for Large Language Diffusion Models

arXiv cs.CL · 2026-05-15

This paper presents the first systematic study of uncertainty quantification (UQ) for Large Language Diffusion Models (LLDMs), proposing lightweight zero-shot uncertainty signals derived from the iterative denoising process and showing that LLDMs can achieve both fast inference and reliable hallucination detection with up to 100x lower computational overhead compared to sampling-based baselines.
