GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation

Hugging Face Daily Papers 05/20/26, 12:00 AM Papers

Summary

GenEvolve is a self-evolving image generation framework that uses tool-orchestrated trajectories and visual experience distillation to iteratively improve generative capabilities, achieving state-of-the-art performance.

Open-ended image generation is no longer a simple prompt-to-image problem. High-quality generation often requires an agent to combine a model's internal generative ability with external resources. As requests become more diverse and demanding, we aim to develop a general image-generation agent that can self-evolve through trajectories and use tools more effectively across varied generation challenges. To this end, we propose GenEvolve, a self-evolving framework based on Tool-Orchestrated Visual Experience Distillation. In GenEvolve, each generation attempt is modeled as a tool-orchestrated trajectory, where the agent gathers evidence, selects references, invokes generation skills, and composes them into a prompt-reference program. Unlike existing agentic generation methods that mainly rely on image-level scalar rewards, GenEvolve compares multiple trajectories for the same request and abstracts best-worst differences into structured visual experience, provided only to a privileged teacher branch. Inspired by on-policy self-distillation, Visual Experience Distillation provides dense token-level supervision, helping the student internalize better search, knowledge activation, reference selection, and prompt construction. We further construct GenEvolve-Data and GenEvolve-Bench. Experiments on public benchmarks and GenEvolve-Bench show substantial gains over strong baselines, achieving state-of-the-art performance among current image-generation frameworks. Our website is as follows: https://ephemeral182.github.io/GenEvolve/
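The trajectory structure and best-worst comparison described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Trajectory` fields, the `score` value (a stand-in for whatever judgment ranks attempts), and the `best_worst_experience` helper are all hypothetical names, and the abstract does not specify how differences are actually abstracted.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """One generation attempt for a request (all field names are hypothetical)."""
    request: str
    evidence: list[str] = field(default_factory=list)    # facts gathered via tools
    references: list[str] = field(default_factory=list)  # selected reference images
    skills: list[str] = field(default_factory=list)      # generation skills invoked
    prompt: str = ""                                     # final composed prompt
    score: float = 0.0                                   # assumed ranking signal


def best_worst_experience(trajectories: list[Trajectory]) -> dict:
    """Contrast the best and worst attempts for one request.

    Returns a small structured record of what differed, as a toy stand-in
    for the "structured visual experience" mentioned in the abstract.
    """
    if len(trajectories) < 2:
        raise ValueError("need at least two trajectories to compare")
    best = max(trajectories, key=lambda t: t.score)
    worst = min(trajectories, key=lambda t: t.score)
    return {
        "request": best.request,
        "references_only_in_best": sorted(set(best.references) - set(worst.references)),
        "skills_only_in_best": sorted(set(best.skills) - set(worst.skills)),
        "skills_only_in_worst": sorted(set(worst.skills) - set(best.skills)),
        "best_prompt": best.prompt,
        "worst_prompt": worst.prompt,
    }
```

In this sketch the resulting record would be shown only to the teacher branch, so the student has to learn the better behavior rather than read it from its context.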

Original Article

Cached at: 05/22/26, 10:19 AM

Source: https://huggingface.co/papers/2605.21605

Abstract

A self-evolving image generation framework uses tool-orchestrated trajectories and visual experience distillation to improve generative capabilities through iterative learning and reference-based prompting.

Open-ended image generation is no longer a simple prompt-to-image problem. High-quality generation often requires an agent to combine a model’s internal generative ability with external resources. As requests become more diverse and demanding, we aim to develop a general image-generation agent that can self-evolve through trajectories and use tools more effectively across varied generation challenges. To this end, we propose GenEvolve, a self-evolving framework based on Tool-Orchestrated Visual Experience Distillation. In GenEvolve, each generation attempt is modeled as a tool-orchestrated trajectory, where the agent gathers evidence, selects references, invokes generation skills, and composes them into a prompt-reference program. Unlike existing agentic generation methods that mainly rely on image-level scalar rewards, GenEvolve compares multiple trajectories for the same request and abstracts best-worst differences into structured visual experience, provided only to a privileged teacher branch. Inspired by on-policy self-distillation, Visual Experience Distillation provides dense token-level supervision, helping the student internalize better search, knowledge activation, reference selection, and prompt construction. We further construct GenEvolve-Data and GenEvolve-Bench. Experiments on public benchmarks and GenEvolve-Bench show substantial gains over strong baselines, achieving state-of-the-art performance among current image-generation frameworks. Our website is as follows: https://ephemeral182.github.io/GenEvolve/
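To make "dense token-level supervision" concrete, here is a minimal sketch of a per-token distillation signal, assuming the teacher and student produce logits over the same vocabulary at each position of a student-sampled sequence. The abstract does not state the loss or the KL direction; the forward KL used here, and the function names `per_token_kl` and `distillation_loss`, are illustrative assumptions only.

```python
import math


def log_softmax(logits: list[float]) -> list[float]:
    """Numerically stable log-softmax over one vocabulary row."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def per_token_kl(teacher_logits: list[list[float]],
                 student_logits: list[list[float]]) -> list[float]:
    """KL(teacher || student) at every token position (direction is assumed).

    Each position gets its own loss value, in contrast to a single
    scalar reward assigned to the final image.
    """
    if len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student must cover the same positions")
    losses = []
    for t_row, s_row in zip(teacher_logits, student_logits):
        lp = log_softmax(t_row)  # teacher: conditioned on privileged experience
        lq = log_softmax(s_row)  # student: conditioned on the request only
        losses.append(sum(math.exp(a) * (a - b) for a, b in zip(lp, lq)))
    return losses


def distillation_loss(teacher_logits: list[list[float]],
                      student_logits: list[list[float]]) -> float:
    """Mean per-token KL over the sequence."""
    losses = per_token_kl(teacher_logits, student_logits)
    return sum(losses) / len(losses)
```

The loss is zero wherever the student already matches the teacher and positive elsewhere, so the training signal concentrates on the tokens where the privileged experience changes the teacher's prediction.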

Get this paper in your agent:

hf papers read 2605.21605

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper: 1

#### MeiGen-AI/GenEvolve • Image-Text-to-Text • 9B • Updated about 9 hours ago • 22 • 4

Datasets citing this paper: 1

#### MeiGen-AI/GenEvolve-Data-Bench • Viewer • Updated about 9 hours ago • 12.8k • 116 • 1

Similar Articles

EvoMap/evolver

GenClaw: Code-Driven Agentic Image Generation

Verilog-Evolve: Feedback-Driven and Skill-Evolving Verilog Generation

FlashEvolve: Accelerating Agent Self-Evolution with Asynchronous Stage Orchestration

OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation
