High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation

Hugging Face Daily Papers 06/10/26, 12:00 AM Papers

Summary

This paper introduces Z-Image Turbo++, a two-step image generation model distilled from an eight-step teacher using distribution-aligned adversarial learning, step-decoupled parameterization, and end-to-end training with iterative regularization to narrow the quality gap with multi-step generation.

Few-step diffusion distillation has become increasingly mature for 4-8-step generation, yet pushing further to 2 steps remains challenging. In this work, we introduce Z-Image Turbo++, a high-quality 2-step image generation model distilled from the 8-step Z-Image Turbo teacher. Our method addresses the central bottlenecks of increased task difficulty and limited model capacity in 2-step generation through three simple but effective design choices tailored to this regime. First, we propose Distribution-Aligned Adversarial Learning, which uses teacher-generated images rather than external real images as real samples for GAN training, providing a more attainable and informative adversarial target. Second, we adopt Step-Decoupled Parameterization, assigning independent model parameters to the two denoising steps to better match their distinct capacity demands. Third, we perform End-to-End Training with Iterative Regularization, allowing the first step to receive gradients from final image quality while preserving a meaningful intermediate generation through an explicit step-1 loss. Together, these designs substantially narrow the quality gap between 2-step and 8-step generation in both qualitative and quantitative evaluations, highlighting the potential of carefully tailored distillation strategies for improving the quality-efficiency trade-off in few-step generation.
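The second and third design choices can be illustrated with a toy sketch. This is not the paper's implementation: the scalar "images", the linear `denoise` function, the squared-error losses, the weight `lam`, and the choice to pull the intermediate sample toward the teacher image are all assumptions made for illustration. The abstract only says that the two steps use independent parameters and that an explicit step-1 loss regularizes the intermediate generation.

```python
def denoise(x, params):
    # Toy stand-in for a denoising network: a single affine map.
    return params["scale"] * x + params["shift"]


def two_step_generate(noise, step1_params, step2_params):
    # Step-decoupled parameterization: each step has its own parameter set.
    x_mid = denoise(noise, step1_params)   # intermediate generation
    x_out = denoise(x_mid, step2_params)   # final image
    return x_mid, x_out


def total_loss(noise, teacher_image, step1_params, step2_params, lam=0.5):
    # End-to-end objective: a loss on the final image plus a weighted
    # step-1 loss on the intermediate (the form of both is assumed here).
    x_mid, x_out = two_step_generate(noise, step1_params, step2_params)
    final_loss = (x_out - teacher_image) ** 2
    step1_loss = (x_mid - teacher_image) ** 2
    return final_loss + lam * step1_loss


def grad_wrt_step1_scale(noise, teacher_image, step1_params, step2_params,
                         lam, eps=1e-5):
    # Central finite difference with respect to the step-1 scale parameter.
    up = dict(step1_params, scale=step1_params["scale"] + eps)
    down = dict(step1_params, scale=step1_params["scale"] - eps)
    diff = (total_loss(noise, teacher_image, up, step2_params, lam)
            - total_loss(noise, teacher_image, down, step2_params, lam))
    return diff / (2 * eps)


step1 = {"scale": 0.5, "shift": 0.0}
step2 = {"scale": 2.0, "shift": 0.0}

# With lam=0.0 only the final-image loss is active, yet the step-1
# parameter still receives a non-zero gradient through the second step.
g_final_only = grad_wrt_step1_scale(1.0, 3.0, step1, step2, lam=0.0)
g_with_reg = grad_wrt_step1_scale(1.0, 3.0, step1, step2, lam=0.5)
print(g_final_only, g_with_reg)
```

In this toy setting, setting `lam` to zero leaves the first step trained purely through the final image, while a positive `lam` adds a direct signal on the intermediate sample, which is the role the abstract assigns to the step-1 loss.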

Cached at: 06/12/26, 06:50 AM

Source: https://huggingface.co/papers/2606.12575

Abstract

A 2-step image generation model is developed through distillation from an 8-step teacher using distribution-aligned adversarial learning, step-decoupled parameterization, and end-to-end training with iterative regularization.

Few-step diffusion distillation has become increasingly mature for 4-8-step generation, yet pushing further to 2 steps remains challenging. In this work, we introduce Z-Image Turbo++, a high-quality 2-step image generation model distilled from the 8-step Z-Image Turbo teacher. Our method addresses the central bottlenecks of increased task difficulty and limited model capacity in 2-step generation through three simple but effective design choices tailored to this regime. First, we propose Distribution-Aligned Adversarial Learning, which uses teacher-generated images rather than external real images as real samples for GAN training, providing a more attainable and informative adversarial target. Second, we adopt Step-Decoupled Parameterization, assigning independent model parameters to the two denoising steps to better match their distinct capacity demands. Third, we perform End-to-End Training with Iterative Regularization, allowing the first step to receive gradients from final image quality while preserving a meaningful intermediate generation through an explicit step-1 loss. Together, these designs substantially narrow the quality gap between 2-step and 8-step generation in both qualitative and quantitative evaluations, highlighting the potential of carefully tailored distillation strategies for improving the quality-efficiency trade-off in few-step generation.
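The first design choice, using teacher-generated images as the discriminator's "real" samples, can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not specify the GAN objective, so a standard binary cross-entropy discriminator loss and a non-saturating generator loss are assumed, images are reduced to single floats, and `discriminator` is a hypothetical callable that returns a logit.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def discriminator_loss(discriminator, teacher_images, student_images):
    # Teacher outputs play the role of "real" samples; no external
    # real-image dataset is involved. Student outputs are "fake".
    real_terms = [softplus(-discriminator(x)) for x in teacher_images]
    fake_terms = [softplus(discriminator(x)) for x in student_images]
    return (sum(real_terms) + sum(fake_terms)) / (len(real_terms) + len(fake_terms))


def generator_loss(discriminator, student_images):
    # Non-saturating loss: the student is rewarded when its images are
    # scored as teacher-like by the discriminator.
    terms = [softplus(-discriminator(x)) for x in student_images]
    return sum(terms) / len(terms)


# Toy data: scalar "images" from the 8-step teacher and the 2-step student.
teacher_images = [1.0, 1.2, 0.9]
student_images = [0.1, -0.2, 0.3]


def undecided(x):
    # A discriminator that cannot tell the two sets apart.
    return 0.0


def separating(x):
    # A discriminator that scores values near the teacher's range higher.
    return 4.0 * (x - 0.6)


print(discriminator_loss(undecided, teacher_images, student_images))
print(discriminator_loss(separating, teacher_images, student_images))
print(generator_loss(separating, student_images))
```

Because the target distribution is the teacher's own output, the student is asked to match something a distilled model of similar architecture can plausibly reach, which is the sense in which the abstract calls the target "more attainable".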

Similar Articles

Qwen-Image-Flash (26 minute read)

@HuggingPapers: Alibaba released Qwen-Image-Flash Few-step distillation goes beyond objectives. Data composition, teacher guidance, and…

Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

Qwen-Image-Flash: Beyond Objective Design

@jiqizhixin: What if you could generate high-quality images in one step instead of hundreds? Stanford and ByteDance introduce W-Flow…
