GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration

Hugging Face Daily Papers 05/29/26, 12:00 AM Papers

Summary

This paper introduces GGT-100K, a dataset of 103,707 image pairs for real-world image restoration, generated by using multimodal foundation models like Nano-Banana-2 to produce high-quality targets from low-quality inputs. Experiments show the dataset improves the generalization of various image restoration models.

Real-world image restoration (IR) is bottlenecked by the scarcity of high-quality paired training data. Synthetic datasets are abundant but often fail to model real-world degradations, while real-world paired datasets are expensive and difficult to capture. As a result, IR models trained on these datasets show limited generalization in real-world scenarios. In this work, we propose Generative Ground Truth (GGT) by using generative multimodal foundation models (MFMs) to produce high-quality (HQ) targets from real-world low-quality (LQ) images. We first conduct a systematic evaluation of nine state-of-the-art MFMs, including Nano-Banana-2 and GPT-Image-2, on images of various scenes and degradation types. The results demonstrate that Nano-Banana-2 with VLM-based adaptive prompting shows the highest capability to synthesize perceptually realistic and content-faithful HQ targets, which can serve as the GGT for the LQ input. We then employ Nano-Banana-2 to build a GGT synthesis pipeline, which involves multi-stage quality control to ensure data reliability, and construct GGT-100K, an LQ-HQ paired dataset comprising 103,707 training pairs and covering diverse scenes and complex real-world degradations. A test set of 500 image pairs is also established. Extensive experiments show that GGT-100K consistently improves the real-world generalization of a wide range of IR models, with particularly strong benefits for finetuning generative models for IR tasks. Our results suggest that MFMs can serve as practical tools for restoration-oriented data generation, and GGT-100K is a useful resource to expand the generalization boundaries of real-world IR models.
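The idea of adaptive prompting can be illustrated with a minimal sketch: a vision-language model first describes each LQ image, and the restoration prompt is then composed from that description instead of being fixed for all images. The abstract does not give the actual prompts or the VLM output format, so `ImageAnalysis`, `build_restoration_prompt`, and the prompt wording below are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ImageAnalysis:
    """Hypothetical structured output of a VLM that inspected one LQ image."""
    scene: str               # short scene description, e.g. "a night street"
    degradations: List[str]  # degradation types the VLM reported


def build_restoration_prompt(analysis: ImageAnalysis) -> str:
    """Compose an editing prompt tailored to one image (illustrative wording only)."""
    if analysis.degradations:
        issues = ", ".join(analysis.degradations)
        fix_clause = f"Remove the following degradations: {issues}."
    else:
        # No specific degradation reported: fall back to a generic instruction.
        fix_clause = "Improve overall clarity without altering the content."
    return (
        f"This is a photo of {analysis.scene}. {fix_clause} "
        "Preserve the original content, layout, and colors; "
        "do not add or remove objects."
    )


if __name__ == "__main__":
    analysis = ImageAnalysis("a night street", ["motion blur", "sensor noise"])
    print(build_restoration_prompt(analysis))
```

The point of the design is that the prompt names the degradations found in each image and explicitly asks for content preservation, which matches the abstract's emphasis on targets that are both perceptually realistic and content-faithful.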


Cached at: 06/01/26, 03:17 AM


Source: https://huggingface.co/papers/2605.31039

Abstract

Generative multimodal foundation models are used to create high-quality training data for image restoration, improving model generalization across diverse real-world scenarios.

Real-world image restoration (IR) is bottlenecked by the scarcity of high-quality paired training data. Synthetic datasets are abundant but often fail to model real-world degradations, while real-world paired datasets are expensive and difficult to capture. As a result, IR models trained on these datasets show limited generalization in real-world scenarios. In this work, we propose Generative Ground Truth (GGT) by using generative multimodal foundation models (MFMs) to produce high-quality (HQ) targets from real-world low-quality (LQ) images. We first conduct a systematic evaluation of nine state-of-the-art MFMs, including Nano-Banana-2 and GPT-Image-2, on images of various scenes and degradation types. The results demonstrate that Nano-Banana-2 with VLM-based adaptive prompting shows the highest capability to synthesize perceptually realistic and content-faithful HQ targets, which can serve as the GGT for the LQ input. We then employ Nano-Banana-2 to build a GGT synthesis pipeline, which involves multi-stage quality control to ensure data reliability, and construct GGT-100K, an LQ-HQ paired dataset comprising 103,707 training pairs and covering diverse scenes and complex real-world degradations. A test set of 500 image pairs is also established. Extensive experiments show that GGT-100K consistently improves the real-world generalization of a wide range of IR models, with particularly strong benefits for finetuning generative models for IR tasks. Our results suggest that MFMs can serve as practical tools for restoration-oriented data generation, and GGT-100K is a useful resource to expand the generalization boundaries of real-world IR models.
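A multi-stage quality-control pass over candidate LQ-HQ pairs could be organized as a chain of filters, where a pair is kept only if it passes every stage. The abstract does not say which checks the authors use, so the stages below (an aspect-ratio check and two score thresholds) and the `Pair` fields are assumptions chosen only to show the structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Pair:
    """One LQ input with its candidate generated HQ target (hypothetical record)."""
    pair_id: str
    lq_size: Tuple[int, int]  # (width, height) of the LQ input
    hq_size: Tuple[int, int]  # (width, height) of the generated target
    fidelity: float           # hypothetical content-faithfulness score in [0, 1]
    realism: float            # hypothetical perceptual-quality score in [0, 1]


Stage = Callable[[Pair], bool]


def same_aspect_ratio(pair: Pair, tol: float = 0.01) -> bool:
    """Reject targets whose aspect ratio drifted from the input's."""
    lw, lh = pair.lq_size
    hw, hh = pair.hq_size
    return abs(lw / lh - hw / hh) <= tol


def min_score(attr: str, threshold: float) -> Stage:
    """Build a stage that requires pair.<attr> to reach a threshold."""
    def check(pair: Pair) -> bool:
        return getattr(pair, attr) >= threshold
    return check


def run_quality_control(
    pairs: Iterable[Pair], stages: List[Tuple[str, Stage]]
) -> Tuple[List[Pair], Dict[str, str]]:
    """Return the kept pairs and, per rejected pair id, the first failing stage."""
    kept: List[Pair] = []
    rejected: Dict[str, str] = {}
    for pair in pairs:
        for name, stage in stages:
            if not stage(pair):
                rejected[pair.pair_id] = name
                break
        else:
            # The inner loop finished without a break: every stage passed.
            kept.append(pair)
    return kept, rejected


if __name__ == "__main__":
    stages = [
        ("aspect", same_aspect_ratio),
        ("fidelity", min_score("fidelity", 0.8)),
        ("realism", min_score("realism", 0.7)),
    ]
    candidates = [
        Pair("a", (640, 480), (1280, 960), 0.9, 0.9),
        Pair("b", (640, 480), (1280, 720), 0.9, 0.9),
    ]
    kept, rejected = run_quality_control(candidates, stages)
    print([p.pair_id for p in kept], rejected)
```

Recording the first failing stage for each rejected pair makes it easy to see which check removes the most data, which is useful when tuning thresholds.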


Get this paper in your agent:

hf papers read 2605.31039

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


Datasets citing this paper: 1

#### VCLab-PolyU/GGT-100K



