High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation

Hugging Face Daily Papers

Summary

This paper introduces Z-Image Turbo++, a two-step image generation model distilled from an eight-step teacher using distribution-aligned adversarial learning, step-decoupled parameterization, and end-to-end training with iterative regularization to narrow the quality gap with multi-step generation.

Few-step diffusion distillation has become increasingly mature for 4-8-step generation, yet pushing further to 2 steps remains challenging. In this work, we introduce Z-Image Turbo++, a high-quality 2-step image generation model distilled from the 8-step Z-Image Turbo teacher. Our method addresses the central bottlenecks of increased task difficulty and limited model capacity in 2-step generation through three simple but effective design choices tailored to this regime. First, we propose Distribution-Aligned Adversarial Learning, which uses teacher-generated images rather than external real images as real samples for GAN training, providing a more attainable and informative adversarial target. Second, we adopt Step-Decoupled Parameterization, assigning independent model parameters to the two denoising steps to better match their distinct capacity demands. Third, we perform End-to-End Training with Iterative Regularization, allowing the first step to receive gradients from final image quality while preserving a meaningful intermediate generation through an explicit step-1 loss. Together, these designs substantially narrow the quality gap between 2-step and 8-step generation in both qualitative and quantitative evaluations, highlighting the potential of carefully tailored distillation strategies for improving the quality-efficiency trade-off in few-step generation.
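
The three design choices above can be pictured with a toy objective. The sketch below is only an illustration under stated assumptions, not the paper's implementation: an "image" is a single float, each denoising step is an affine map with its own parameters, the discriminator is a fixed linear logit, the adversarial term uses the standard non-saturating GAN loss, and the step-1 loss is a placeholder squared error to a hypothetical `step1_target` (the abstract does not give the form of that loss). The names `generator_loss`, `discriminator_loss`, `finite_diff`, and the weight `lam` are all invented for this example.

```python
import math

def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def discriminator(x, w=1.0, b=0.0):
    """Toy critic: a linear logit (stand-in for a real discriminator network)."""
    return w * x + b

def discriminator_loss(teacher_sample, student_sample):
    """Teacher output plays the 'real' role; student output is the 'fake'."""
    real_term = softplus(-discriminator(teacher_sample))
    fake_term = softplus(discriminator(student_sample))
    return real_term + fake_term

def generator_loss(params, noise, step1_target, lam=0.5):
    """Adversarial term on the final image plus a step-1 regularizer.

    params = (s1, b1, s2, b2): step 1 and step 2 own separate parameters,
    mimicking step-decoupled parameterization. The squared-error regularizer
    and the weight lam are assumptions made for illustration.
    """
    s1, b1, s2, b2 = params
    intermediate = s1 * noise + b1              # denoising step 1
    final = s2 * intermediate + b2              # denoising step 2
    adv = softplus(-discriminator(final))       # non-saturating GAN loss
    reg = (intermediate - step1_target) ** 2    # placeholder step-1 loss
    return adv + lam * reg, adv, reg

def finite_diff(f, params, index, eps=1e-6):
    """Central-difference derivative of f with respect to params[index]."""
    hi = list(params)
    lo = list(params)
    hi[index] += eps
    lo[index] -= eps
    return (f(hi) - f(lo)) / (2 * eps)

params = [0.8, 0.1, 1.2, -0.3]   # arbitrary toy values
noise, step1_target = 1.0, 1.0
total, adv, reg = generator_loss(params, noise, step1_target)

# End-to-end training: the adversarial term on the FINAL image depends on
# the step-1 shift b1, so step 1 receives a learning signal from it.
grad_adv_b1 = finite_diff(
    lambda p: generator_loss(p, noise, step1_target)[1], params, 1)

# The step-1 regularizer only constrains the intermediate output, so it
# contributes nothing to the step-2 shift b2.
grad_reg_b2 = finite_diff(
    lambda p: generator_loss(p, noise, step1_target)[2], params, 3)

print(f"total={total:.4f} adv={adv:.4f} reg={reg:.4f}")
print(f"d adv / d b1 = {grad_adv_b1:.4f}, d reg / d b2 = {grad_reg_b2:.4f}")
```

In this toy setting the derivative of the adversarial term with respect to the step-1 shift is nonzero, which is the point of end-to-end training: step 1 is shaped by final image quality. The regularizer only touches step 1, keeping the intermediate output close to its target. A real system would use neural networks and automatic differentiation in place of the affine maps and finite differences used here.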

Cached at: 06/12/26, 06:50 AM

Source: https://huggingface.co/papers/2606.12575

Similar Articles

Qwen-Image-Flash (26 minute read)

TLDR AI

This paper from Alibaba revisits few-step distillation for visual generative models, focusing on training recipe factors such as data composition, teacher guidance, and task mixture, using Qwen-Image-2.0 as a case study to develop Qwen-Image-Flash.

Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

Hugging Face Daily Papers

RTDMD is a two-stage framework combining distribution matching distillation with reward-guided reinforcement learning to improve few-step image generation alignment with human preferences. It achieves state-of-the-art results on multiple models with only 4 inference steps.

Qwen-Image-Flash: Beyond Objective Design

Hugging Face Daily Papers

This paper investigates training recipes for few-step distillation of visual generative models, using Qwen-Image-2.0 as a case study. It reveals non-obvious behaviors and proposes Qwen-Image-Flash.