Mitigating Multimodal Hallucination via Phase-wise Self-reward

Hugging Face Daily Papers

Summary

The PSRD framework halves multimodal hallucination in LVLMs via phase-wise self-reward decoding and a distilled lightweight reward model, without extra supervision.

Large Vision-Language Models (LVLMs) still struggle with vision hallucination, where generated responses are inconsistent with the visual input. Existing methods either rely on large-scale annotated data for fine-tuning, which incurs massive computational overhead, or employ static post-hoc strategies that overlook the dynamic nature of hallucination emergence. To address these issues, we introduce a new self-rewarding framework, enabling dynamic hallucination mitigation at inference time without external supervision. On the empirical side, we reveal that visual hallucination exhibits phase-wise dynamic patterns, peaking at the onset of each semantic phase. Drawing on these insights, we propose PSRD (Phase-wise Self-Reward Decoding) for online hallucination correction guided by phase-wise self-reward signals. To reduce the cost of repeated self-evaluation during decoding, we distill the hallucination guidance signal from LVLMs into a lightweight reward model. The reward model subsequently provides on-the-fly guidance for targeted intervention during the decoding process, enabling precise hallucination suppression. The proposed PSRD significantly reduces the hallucination rate of LLaVA-1.5-7B by 50.0% and consistently outperforms existing post-hoc methods across five hallucination evaluation benchmarks for four LVLMs. Further analysis confirms that PSRD effectively mitigates hallucination propagation and achieves a highly controllable trade-off between strong performance and inference efficiency.
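The decoding loop the abstract describes — detect the onset of each semantic phase, then apply the reward model's score only there — can be sketched as follows. This is a minimal illustration under assumed interfaces: `step_logits`, `reward_fn`, and the punctuation-based onset heuristic are hypothetical stand-ins, not the paper's actual components.

```python
# Hedged sketch of phase-wise reward-guided decoding. All names
# (is_phase_onset, reward_fn, step_logits) are illustrative assumptions,
# not the paper's actual interface.

def is_phase_onset(tokens):
    """Assume a new semantic phase starts at the beginning of the response
    and after sentence/clause boundaries (where hallucination peaks)."""
    return not tokens or tokens[-1] in {".", ",", ";"}

def decode_with_phase_reward(step_logits, reward_fn, max_steps=10):
    """Greedy decoding that reranks candidate tokens with a lightweight
    reward signal, but only at phase onsets (targeted intervention)."""
    tokens = []
    for _ in range(max_steps):
        logits = step_logits(tokens)  # dict: candidate token -> LVLM logit
        if not logits:
            break
        if is_phase_onset(tokens):
            # Add the reward model's score where hallucination risk peaks.
            scored = {t: l + reward_fn(tokens, t) for t, l in logits.items()}
        else:
            scored = logits
        tokens.append(max(scored, key=scored.get))
    return tokens

# Toy demo: raw logits favor the visually unsupported token "red",
# but the reward penalizes it at the phase onset, so "blue" wins there.
VOCAB_LOGITS = {"red": 2.0, "blue": 1.5, ".": 0.5}

def step_logits(tokens):
    return {} if len(tokens) >= 3 else dict(VOCAB_LOGITS)

def reward_fn(tokens, candidate):
    return -1.0 if candidate == "red" else 0.0
```

In the toy run, `decode_with_phase_reward(step_logits, reward_fn, max_steps=1)` picks `"blue"` at the phase onset, whereas a zero reward would pick `"red"` — illustrating how onset-only intervention steers the first token of a phase without touching the rest of decoding.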

Cached at: 04/22/26, 10:35 AM


Source: https://huggingface.co/papers/2604.17982

Abstract

A new self-rewarding framework called PSRD is introduced for dynamic hallucination mitigation in large vision-language models during inference, using phase-wise self-reward signals and a distilled lightweight reward model for efficient hallucination correction.



Get this paper in your agent:

hf papers read 2604.17982

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


Similar Articles

HalluSAE: Detecting Hallucinations in Large Language Models via Sparse Auto-Encoders

arXiv cs.CL

Researchers from Beihang University and other institutions propose HalluSAE, a framework using sparse autoencoders and phase transition theory to detect hallucinations in LLMs by modeling generation as trajectories through a potential energy landscape and identifying critical transition zones where factual errors occur.

Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits

arXiv cs.CL

This paper presents PCNet, a probabilistic circuit trained as a tractable density estimator on LLM residual streams to detect hallucinations as geometric anomalies. It also introduces PC-LDCD, a dynamic correction method that only intervenes on hallucinated tokens, achieving near-perfect detection and reduced corruption rates.

Mechanisms of Prompt-Induced Hallucination in Vision-Language Models

arXiv cs.CL

This paper investigates prompt-induced hallucinations in vision-language models through mechanistic analysis, identifying specific attention heads responsible for the models' tendency to favor textual prompts over visual evidence. The authors demonstrate that ablating these prompt-induced-hallucination (PIH) heads reduces hallucinations by at least 40% without additional training, revealing model-specific mechanisms underlying this failure mode.