ReMMD: Realistic Multilingual Multi-Image Agentic Verification for Multimodal Misinformation Detection

Hugging Face Daily Papers 06/23/26, 12:00 AM Papers

multimodal misinformation-detection agentic-verification multilingual multi-image benchmark framework

Summary

ReMMD introduces a realistic multilingual multi-image agentic verification framework for multimodal misinformation detection, including a benchmark (ReMMDBench) with 500 samples and 2,756 images, and an agent (ReMMD-Agent) that achieves superior veracity performance with reduced costs.

Multimodal misinformation detection is increasingly important because viral posts now combine long multilingual narratives, several images, mixed provenance, and subtle text-image framing errors. Existing benchmarks and methods remain poorly matched to this setting: they usually isolate short captions, single images, binary labels, or one manipulation source, while agentic verification remains costly under realistic evidence search. We present ReMMD, a realistic multilingual multi-image agentic verification framework for multimodal misinformation detection. ReMMD includes ReMMDBench, a real-world multimodal misinformation detection benchmark with 500 samples, 2,756 images, five monolingual languages, two cross-lingual settings, three text-length tiers, multi-image posts, five-way veracity labels, eight distortion labels, evidence provenance, and rationales. It also includes ReMMD-Agent, a persistent-memory verifier that decomposes posts into atomic points, builds a reusable evidence set, and predicts structured L1/L2/L3 outputs. Across proprietary systems, open LVLMs, MMD-Agent, and T2-Agent, ReMMD-Agent obtains the best five-way veracity performance, with 41.80% accuracy and 39.12% macro-F1 using GPT-5.2, while reducing cost by 17.5% relative to MMD-Agent and 79.9% relative to T2-Agent. The project is available at https://dang-ai.github.io/ReMMD.
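The headline numbers are accuracy and macro-F1 over five veracity classes. As a rough illustration of how those two metrics differ, the sketch below computes both from scratch. The label names `A` to `E` and the toy predictions are placeholders invented for this example; the abstract does not list the actual five veracity labels, and this is not the authors' evaluation code.

```python
def accuracy(gold, pred):
    """Fraction of samples whose predicted label matches the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # Assumption: a class with no gold and no predicted samples scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical five-way label set and toy data (not from ReMMDBench).
LABELS = ["A", "B", "C", "D", "E"]
gold = ["A", "A", "B", "C", "D", "E"]
pred = ["A", "B", "B", "C", "E", "E"]

acc = accuracy(gold, pred)
mf1 = macro_f1(gold, pred, LABELS)
print(f"accuracy={acc:.4f} macro_f1={mf1:.4f}")
```

In this toy case class `D` is never predicted correctly, which pulls macro-F1 below accuracy. That gap is the reason benchmarks with imbalanced multi-way labels often report both.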

Original Article

Cached at: 06/24/26, 05:46 AM

Source: https://huggingface.co/papers/2606.24112

Abstract

A comprehensive multimodal misinformation detection framework is introduced that handles complex, multilingual content with multiple images and diverse verification approaches, achieving superior performance while reducing computational costs.

Multimodal misinformation detection is increasingly important because viral posts now combine long multilingual narratives, several images, mixed provenance, and subtle text-image framing errors. Existing benchmarks and methods remain poorly matched to this setting: they usually isolate short captions, single images, binary labels, or one manipulation source, while agentic verification remains costly under realistic evidence search. We present ReMMD, a realistic multilingual multi-image agentic verification framework for multimodal misinformation detection. ReMMD includes ReMMDBench, a real-world multimodal misinformation detection benchmark with 500 samples, 2,756 images, five monolingual languages, two cross-lingual settings, three text-length tiers, multi-image posts, five-way veracity labels, eight distortion labels, evidence provenance, and rationales. It also includes ReMMD-Agent, a persistent-memory verifier that decomposes posts into atomic points, builds a reusable evidence set, and predicts structured L1/L2/L3 outputs. Across proprietary systems, open LVLMs, MMD-Agent, and T2-Agent, ReMMD-Agent obtains the best five-way veracity performance, with 41.80% accuracy and 39.12% macro-F1 using GPT-5.2, while reducing cost by 17.5% relative to MMD-Agent and 79.9% relative to T2-Agent. The project is available at https://dang-ai.github.io/ReMMD.
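The abstract attributes part of the cost reduction to a persistent memory and a reusable evidence set. One way to picture that idea is a cache that runs an evidence search only the first time a given atomic point is seen. Everything in the sketch below is hypothetical: `EvidenceMemory`, `split_into_points`, and `fake_search` are illustrative names, the sentence splitter stands in for whatever decomposition the agent actually uses, and the abstract does not say whether memory is shared across posts as it is here.

```python
import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EvidenceMemory:
    """Hypothetical reusable evidence set: each distinct point is searched once."""
    search: Callable[[str], list[str]]
    store: dict[str, list[str]] = field(default_factory=dict)
    search_calls: int = 0

    def lookup(self, point: str) -> list[str]:
        # Normalise case and whitespace so trivially different phrasings share a key.
        key = " ".join(point.lower().split())
        if key not in self.store:
            self.search_calls += 1  # only cache misses incur search cost
            self.store[key] = self.search(point)
        return self.store[key]


def split_into_points(post: str) -> list[str]:
    """Naive stand-in for atomic-point decomposition: split on sentence ends."""
    parts = re.split(r"(?<=[.!?])\s+", post.strip())
    return [p for p in parts if p]


def fake_search(query: str) -> list[str]:
    """Placeholder for a real evidence search backend."""
    return [f"evidence for: {query}"]


memory = EvidenceMemory(search=fake_search)
posts = [
    "The bridge collapsed on Monday. The photo shows the flood.",
    "The photo shows the flood. Officials denied the report.",
]
for post in posts:
    for point in split_into_points(post):
        memory.lookup(point)

print(memory.search_calls)  # number of searches actually issued
```

The two toy posts contain four points but only three distinct ones, so the repeated point is served from memory instead of triggering a second search.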

Get this paper in your agent:

hf papers read 2606.24112

Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Datasets citing this paper: 1

DDAI-D/ReMMDBench

Similar Articles

SynCred-Bench: Benchmarking Synthetic Credibility in AI-Generated Visual Misinformation

MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory

Reinforcing Multimodal Reasoning Against Visual Degradation

InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning

MARDoc: A Memory-Aware Refinement Agent Framework for Multimodal Long Document QA
