Tag
Stanford and ByteDance introduce W-Flow, a single-step generative model that uses Wasserstein gradient flows to achieve state-of-the-art one-step ImageNet 256x256 generation (1.29 FID) with 100x faster sampling than multi-step diffusion models.
One-Forcing improves one-step video generation by augmenting the DMD objective with an auxiliary GAN loss, achieving state-of-the-art performance with reduced training costs.
Discrete MeanFlow is a method for one-step generation in discrete state spaces that learns the conditional transition kernels of continuous-time Markov chains, avoiding iterative denoising.
Researchers extend MeanFlow one-step image generation from class labels to flexible text inputs by integrating highly discriminative LLM-based text encoders, enabling efficient text-conditioned synthesis with improved performance.