Tag
Parallel Rollout Approximation (PRA) improves pixel-space autoregressive image generation by using low-dimensional intermediate states and parallel training, achieving new state-of-the-art results on ImageNet-1K generation.
A new technique called Spectral Forcing applies a time-conditional 2D-DCT low-pass operator to pixel-space diffusion models, improving efficiency by explicitly separating signal from noise and outperforming baselines on ImageNet and text-to-image tasks.
AsymFlow is a new method from Stanford that converts latent diffusion models to pixel space, achieving more realistic images by avoiding information loss from compression. It surpasses FLUX.2 klein on benchmarks with lower computational cost.
Asymmetric Flow Modeling (AsymFlow) restricts noise prediction to low-rank subspaces for efficient high-dimensional flow-based generation, achieving state-of-the-art results on ImageNet and text-to-image tasks by fine-tuning from latent flow models.
The L2P paper introduces a Latent-to-Pixel transfer paradigm that leverages pre-trained latent diffusion models to create efficient pixel-space models capable of 4K generation with minimal training overhead.
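To make the Spectral Forcing summary above more concrete, here is a minimal sketch of what a time-conditional 2D-DCT low-pass operator could look like. It is not the paper's implementation. The function names, the linear cutoff schedule, and the square frequency mask are all illustrative assumptions, as is the convention that t = 1 means pure noise and t = 0 means a clean image. The sketch uses only the Python standard library and assumes a small square grayscale image stored as a list of lists.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix C of size n x n, so that C times its transpose is the identity."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def cutoff(t, n):
    """Hypothetical schedule (an assumption, not taken from the paper):
    t = 1 keeps only the DC term, t = 0 keeps all n frequencies per axis."""
    return max(1, round((1.0 - t) * n))

def lowpass(image, t):
    """Zero every 2D-DCT coefficient whose row or column index is >= cutoff(t, n)."""
    n = len(image)
    c = dct_matrix(n)
    ct = transpose(c)
    coeffs = matmul(matmul(c, image), ct)      # forward 2D DCT: C X C^T
    keep = cutoff(t, n)
    for u in range(n):
        for v in range(n):
            if u >= keep or v >= keep:
                coeffs[u][v] = 0.0             # drop high-frequency content
    return matmul(matmul(ct, coeffs), c)       # inverse 2D DCT: C^T Y C

if __name__ == "__main__":
    img = [[float((4 * i + j) % 7) for j in range(4)] for i in range(4)]
    for t in (0.0, 0.5, 1.0):
        out = lowpass(img, t)
        print(t, [round(v, 3) for v in out[0]])
```

Under these assumptions, the operator returns the input unchanged at t = 0 and collapses the image to its mean value at t = 1. Intermediate values of t keep a progressively smaller block of low-frequency coefficients. A practical version would use a fast DCT routine on batched tensors rather than explicit matrix products.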