Mitigating Multimodal Hallucination via Phase-wise Self-reward
Summary
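PSRD is a self-rewarding decoding framework that mitigates multimodal hallucination in LVLMs at inference time, using phase-wise self-reward signals and a distilled lightweight reward model without external supervision. It reportedly reduces the hallucination rate of LLaVA-1.5-7B by 50.0%.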
Cached at: 04/22/26, 10:35 AM
Source: https://huggingface.co/papers/2604.17982
Abstract
A new self-rewarding framework called PSRD is introduced for dynamic hallucination mitigation in large vision-language models during inference, using phase-wise self-reward signals and a distilled lightweight reward model for efficient hallucination correction.
Large Vision-Language Models (LVLMs) still struggle with vision hallucination, where generated responses are inconsistent with the visual input. Existing methods either rely on large-scale annotated data for fine-tuning, which incurs massive computational overhead, or employ static post-hoc strategies that overlook the dynamic nature of hallucination emergence. To address these, we introduce a new self-rewarding framework, enabling dynamic hallucination mitigation at inference time without external supervision. On the empirical side, we reveal that visual hallucination exhibits phase-wise dynamic patterns, peaking at the onset of each semantic phase. Drawing on these insights, we propose PSRD (Phase-wise Self-Reward Decoding) for online hallucination correction guided by phase-wise self-reward signals. To reduce the cost of repeated self-evaluation during decoding, we distill the hallucination guidance signal from LVLMs into a lightweight reward model. The reward model subsequently provides on-the-fly guidance for targeted intervention during the decoding process, enabling precise hallucination suppression. The proposed PSRD significantly reduces the hallucination rate of LLaVA-1.5-7B by 50.0% and consistently outperforms existing post-hoc methods across five hallucination evaluation benchmarks for four LVLMs. Further analysis confirms that PSRD effectively mitigates hallucination propagation and achieves a highly controllable trade-off between strong performance and inference efficiency.
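The idea of intervening only at phase onsets, guided by a reward signal, can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify how phases are detected, what the reward model looks like, or how reward and model scores are combined. Here `is_phase_onset`, `toy_reward`, the `alpha` weight, and the additive scoring rule are all hypothetical stand-ins, and the candidate log-probabilities are made-up numbers.

```python
from typing import Callable, Dict, List

# One decoding step: candidate token -> log-probability from the base model.
Candidates = Dict[str, float]


def is_phase_onset(prefix: List[str]) -> bool:
    """Hypothetical phase detector: start of text or right after a sentence end."""
    return not prefix or prefix[-1] in {".", "!", "?"}


def guided_decode(
    steps: List[Candidates],
    reward: Callable[[List[str], str], float],
    alpha: float = 1.0,
) -> List[str]:
    """Greedy decoding that re-ranks candidates with a reward only at phase onsets.

    Assumption: scores are combined as log-prob + alpha * reward.
    """
    out: List[str] = []
    for cands in steps:
        if is_phase_onset(out):
            # Targeted intervention: add the reward signal to each candidate.
            scores = {t: lp + alpha * reward(out, t) for t, lp in cands.items()}
        else:
            # Elsewhere, fall back to plain greedy decoding.
            scores = dict(cands)
        out.append(max(scores, key=scores.get))
    return out


# Hypothetical stand-in for a distilled lightweight reward model:
# it penalizes a token assumed to be absent from the image.
HALLUCINATED = {"Dogs"}


def toy_reward(prefix: List[str], token: str) -> float:
    return -1.0 if token in HALLUCINATED else 0.0


# Made-up candidate distributions for four decoding steps.
toy_steps: List[Candidates] = [
    {"Dogs": -0.2, "Cats": -0.6},      # phase onset (start of text)
    {"sleep": -0.1, "run": -0.5},
    {".": -0.1, "and": -1.0},
    {"Dogs": -0.3, "Sunlight": -0.7},  # phase onset (after ".")
]

guided = guided_decode(toy_steps, toy_reward, alpha=1.0)
unguided = guided_decode(toy_steps, toy_reward, alpha=0.0)
```

With `alpha=0.0` the sketch reduces to greedy decoding and picks the penalized token at both onsets. With `alpha=1.0` the reward shifts the choice at those two positions only, so the cost of calling the reward function scales with the number of phase onsets rather than the number of tokens.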
Similar Articles
Mitigating Manifold Departure: Uncertainty-Aware Subspace Rectification for Trustworthy MLLM Decoding
This paper introduces MGAP, a training-free decoding method that reduces hallucinations in Multimodal Large Language Models by adaptively suppressing only the harmful parts of language priors while preserving the model's semantic manifold. The method outperforms prior baselines on POPE and CHAIR benchmarks.
Hallucination Detection-Guided Preference Optimization for Clinical Summarization
Introduces HDSR and HDSR-PL, methods that use hallucination detectors to guide iterative self-refinement and preference learning, achieving up to 48% reduction in hallucinations for clinical summarization using Llama and Gemma models on MIMIC-IV-Note.
Breaking the Illusion: When Positive Meets Negative in Multimodal Decoding
This paper introduces Positive-and-Negative Decoding (PND), a training-free inference framework that reduces object hallucination in Vision-Language Models by contrasting positive visual evidence with negative counterfactuals during decoding.
Max-pooling Network Revisited: Analyzing the Role of Semantic Probability in Multiple Instance Learning for Hallucination Detection
This paper analyzes hallucination detection in LLMs, proposing a max-pooling approach that improves efficiency by eliminating costly semantic consistency computations while maintaining competitive performance.
ClinHallu: A Benchmark for Diagnosing Stage-Wise Hallucinations in Medical MLLM Reasoning
ClinHallu is a benchmark for diagnosing and mitigating hallucinations in medical multimodal large language models by decomposing reasoning into visual recognition, knowledge recall, and reasoning integration stages, using trace-supervised fine-tuning to reduce errors.