Tag
This paper studies a lightweight prompting-based recovery approach for LLM dialogue agents when backend database calls fail, showing that the Guided-Retry strategy reduces hallucination by 50% on MultiWOZ and 42% on SGD across six model families.
This paper reveals that hallucination in large vision-language models is caused by a dynamic structural misalignment where certain attention heads act as risky mediators, decoupling from visual evidence to lock onto language priors. The authors propose Fox, a training-free causal intervention framework that diagnoses and physically severs these pathological shortcuts, achieving state-of-the-art performance in faithful decoding.
CaVe-VLM-CoT is a modular reflection-based agentic-RAG framework for vision-language models that enforces evidence-grounded reasoning through a five-stage pipeline, achieving 87.1% accuracy on ScienceQA and proposing a suite of 23 metrics for evaluation.
Introduces MODE-RAG, a multi-agent system using Variational Free Energy and Monte Carlo Tree Search to dynamically gate interventions for mitigating hallucinations in Multimodal Retrieval-Augmented Generation systems, along with the ModeVent evaluation dataset.
This paper proposes a multi-agent 'Trust but Verify' system to reduce medical hallucinations in LLMs. It tests three open-access models on clinical questions about banned drugs and achieves a 53% reduction in hallucination error rate.
Discusses various methods to optimize DiffusionGemma inference, reduce hallucination, and improve performance for tool use and agents, including entropy-bounded sampling, schema scaffolding, and retrieval during denoising.
This paper proposes NTS-CoT, a novel framework that uses Chain-of-Thought reasoning to mitigate hallucinations in LLM-based news timeline summarization. It introduces three modules—Element-CoT, Date Selection, and Causal-CoT—to improve faithfulness and reduce omissions, outperforming state-of-the-art baselines on three benchmarks.
This paper introduces MGAP, a training-free decoding method that reduces hallucinations in Multimodal Large Language Models by adaptively suppressing only the harmful parts of language priors while preserving the model's semantic manifold. The method outperforms prior baselines on POPE and CHAIR benchmarks.
This paper demonstrates that Whisper's hallucination failures on silence, noise, or music can be detected and mitigated purely from internal activations using sparse autoencoders, achieving large reductions in hallucination rate without fine-tuning.
TIGER is an inference-time framework that mitigates hallucinations in multimodal generation by extracting observation and claim graphs and assigning risk scores to repair unsupported facts. It reduces unsupported content across image-to-text, image+text-to-text, audio-to-text, and video-to-text tasks.
MeasHalu is a novel framework for mitigating scientific measurement hallucinations in LLMs through a two-stage reasoning-aware fine-tuning strategy and progressive reward curriculum. It introduces a fine-grained taxonomy of measurement-specific hallucinations and demonstrates improved accuracy on the MeasEval benchmark.
This paper introduces Attention-Shifting (AS), a novel framework for selective machine unlearning in LLMs that balances effective removal of sensitive information against preventing hallucinations and preserving model utility. The method uses importance-aware attention suppression and retention enhancement to preserve up to 15% more accuracy than existing unlearning approaches on standard benchmarks.
FineSteer is a novel inference-time steering framework that decomposes steering into conditional steering and fine-grained vector synthesis stages, using Subspace-guided Conditional Steering (SCS) and Mixture-of-Steering-Experts (MoSE) mechanisms to improve safety and truthfulness while preserving model utility. Experiments show a 7.6% improvement over state-of-the-art methods on TruthfulQA with minimal utility loss.
The PSRD framework halves multimodal hallucination in LVLMs by using phase-wise self-reward decoding and a distilled lightweight reward model, without extra supervision.