CRAFT is a unified counterfactual reasoning framework that improves tabular question answering and fact verification by constructing both original and counterfactual statements, extracting evidence from bidirectional reasoning paths, and integrating that evidence via a weighted mechanism. Experiments show consistent improvements over baselines on the WikiTQ and TabFact datasets.
This paper introduces CHARM, a framework for detecting and mitigating cascading hallucinations in multi-step agentic RAG pipelines, where early-stage errors propagate and amplify across reasoning steps. CHARM achieves an 89.4% cascade detection rate and 82.1% error propagation reduction across multiple benchmarks with low latency overhead.
This paper introduces SEEK, a framework for semantic evidence extraction in multilingual fact verification. SEEK constructs coherent evidence chunks from full articles and fine-tunes multilingual LLMs with LoRA, achieving an improvement of up to 20% in macro-F1 over baselines.
This paper introduces NEI-CAP, a diagnostic protocol for evaluating how 'Not Enough Information' examples are constructed in fact verification benchmarks. It reveals that models trained on shortcut-prone NEI constructions fail to transfer to harder, semantically related insufficient-evidence cases.
This paper uses EEG recordings to study neural dynamics when humans process AI-generated hallucinated content, revealing distinct cognitive patterns and differences between misjudged and correctly judged hallucinations.
RADAR is a role-anchored multi-agent debate framework in which Politician and Scientist agents adversarially reason over evidence to detect misleading half-truths, outperforming baselines on omission-aware fact verification.