Improved Techniques for Training Consistency Models
Summary
OpenAI presents improved techniques for consistency training, in which consistency models learn directly from data without distillation from a pre-trained diffusion model. Changes such as replacing the LPIPS metric with Pseudo-Huber losses and revising the training schedule yield single-step FID scores of 2.51 on CIFAR-10 and 3.25 on ImageNet 64×64.
Similar Articles
Consistency Models
OpenAI introduces Consistency Models, a new family of generative models that enable fast one-step image generation by directly mapping noise to data, while supporting multi-step sampling and zero-shot editing tasks like inpainting and super-resolution. The approach achieves state-of-the-art FID scores on CIFAR-10 and ImageNet 64x64 for one-step generation.
Simplifying, stabilizing, and scaling continuous-time consistency models
OpenAI presents sCM (simplified continuous-time consistency models), a new approach that scales consistency models to 1.5B parameters and achieves ~50x speedup over diffusion models by generating high-quality samples in just 2 steps. The method demonstrates comparable sample quality to state-of-the-art diffusion models while using less than 10% of the effective sampling compute.
High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation
This paper introduces Z-Image Turbo++, a two-step image generation model distilled from an eight-step teacher using distribution-aligned adversarial learning, step-decoupled parameterization, and end-to-end training with iterative regularization to narrow the quality gap with multi-step generation.
@jiqizhixin: What if you could generate high-quality images in one step instead of hundreds? Stanford and ByteDance introduce W-Flow…
Stanford and ByteDance introduce W-Flow, a single-step generative model that uses Wasserstein gradient flows to achieve state-of-the-art one-step ImageNet 256x256 generation (1.29 FID) with 100x faster sampling than multi-step diffusion models.
OpenAI cooked with the new Images 2 Model, the characters can stay extremely consistent, while text is clear and stays the same
OpenAI released an upgraded image model that keeps character appearance highly consistent across frames and renders clear, stable text.