Asymmetric Flow Models
Summary
Asymmetric Flow Modeling (AsymFlow) restricts noise prediction to a low-rank subspace for efficient high-dimensional flow-based generation. It reaches 1.57 FID on ImageNet 256×256 and, when finetuned from a latent flow model, state-of-the-art results in pixel-space text-to-image generation.
Cached at: 05/14/26, 04:17 AM
Source: https://huggingface.co/papers/2605.12964
Abstract
Asymmetric Flow Modeling enables efficient high-dimensional flow-based generation by restricting noise prediction to low-rank subspaces while maintaining full-dimensional data prediction, achieving superior performance in pixel-space text-to-image generation through effective fine-tuning from latent models.
Flow-based generation in high-dimensional spaces is difficult because velocity prediction requires modeling high-dimensional noise, even when data has strong low-rank structure. We present Asymmetric Flow Modeling (AsymFlow), a rank-asymmetric velocity parameterization that restricts noise prediction to a low-rank subspace while keeping data prediction full-dimensional. From this asymmetric prediction, AsymFlow analytically recovers the full-dimensional velocity without changing the network architecture or training/sampling procedures. On ImageNet 256×256, AsymFlow achieves a leading 1.57 FID, outperforming prior DiT/JiT-like pixel diffusion models by a large margin. AsymFlow also provides the first-ever route for finetuning pretrained latent flow models into pixel-space models: aligning the low-rank pixel subspace to the latent space gives a seamless initialization that preserves the latent model’s high-level semantics and structure, so finetuning mainly improves low-level mismatches rather than relearning pixel generation. We show that the pixel AsymFlow model finetuned from FLUX.2 klein 9B establishes a new state of the art for pixel-space text-to-image generation, beating its latent base on HPSv3, DPG-Bench, and GenEval while qualitatively showing substantially improved visual realism.
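The abstract does not give the recovery formula, so the following is only an illustrative sketch of how a full-dimensional velocity could be assembled from a full-dimensional data prediction and a low-rank noise prediction. It assumes a linear interpolation path x_t = (1 - t) * x0 + t * eps with target velocity eps - x0, and an orthonormal basis U for the low-rank subspace. The function names, the path convention, and the choice to fill the orthogonal complement with the noise implied by the data prediction are all assumptions made here, not details taken from the paper.

```python
import numpy as np


def orthonormal_basis(d, r, seed=0):
    """Hypothetical helper: a random d x r matrix with orthonormal columns."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((d, r)))
    return q


def recover_velocity(x_t, t, x0_hat, z_hat, U):
    """Assemble a full-dimensional velocity from an asymmetric prediction.

    x0_hat : full-dimensional data prediction, shape (d,)
    z_hat  : noise prediction expressed in the rank-r subspace, shape (r,)
    U      : orthonormal basis of that subspace, shape (d, r)

    Assumed path: x_t = (1 - t) * x0 + t * eps, so v = eps - x0 (t > 0).
    """
    # Noise implied by the data prediction alone under the assumed path.
    eps_implied = (x_t - (1.0 - t) * x0_hat) / t
    # Inside the subspace, use the predicted low-rank noise.
    eps_low = U @ z_hat
    # Outside the subspace, fall back to the implied noise (assumption).
    eps_perp = eps_implied - U @ (U.T @ eps_implied)
    eps_hat = eps_low + eps_perp
    return eps_hat - x0_hat


def demo(d=64, r=8, t=0.3, seed=1):
    """Check that exact predictions reproduce the true velocity."""
    rng = np.random.default_rng(seed)
    U = orthonormal_basis(d, r)
    x0 = rng.standard_normal(d)
    eps = rng.standard_normal(d)
    x_t = (1.0 - t) * x0 + t * eps
    # Oracle predictions: true data and true noise projected onto U.
    v = recover_velocity(x_t, t, x0, U.T @ eps, U)
    return v, eps - x0
```

With oracle inputs, `demo()` returns a recovered velocity that matches `eps - x0` up to floating-point error, which shows that under these assumptions the combination loses nothing when both predictions are exact. The division by `t` means the sketch is only meaningful away from `t = 0`.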
Models citing this paper: 2
#### Lakonik/AsymFlow-ImageNet
#### Lakonik/AsymFLUX.2-klein-9B (Text-to-Image)
Similar Articles
AsymFlow Claims More Realistic AI Images by Moving Beyond Latent Diffusion
AsymFlow is a new method from Stanford that converts latent diffusion models to pixel space, achieving more realistic images by avoiding information loss from compression. It surpasses FLUX.2 klein on benchmarks with lower computational cost.
MeshFlow: Mesh Generation with Equivariant Flow Matching
MeshFlow introduces an equivariant optimal-transport flow matching model for direct triangle mesh generation, achieving state-of-the-art quality while providing approximately 18x inference speedup over autoregressive methods.
Language Modeling with Hyperspherical Flows
This paper introduces S-FLM, a novel flow-based language model that operates in a hyperspherical latent space to address the computational costs and semantic limitations of existing discrete diffusion and continuous flow models.
Multi-Resolution Flow Matching: Training-Free Diffusion Acceleration via Staged Sampling
MrFlow is a training-free multi-resolution acceleration strategy for flow-matching text-to-image models that combines low-resolution generation with pixel-space super-resolution and noise injection, achieving up to 25x end-to-end speedup without training or runtime modifications.
On-Policy Adversarial Flow Distillation for Autoregressive Video Generation
Proposes Adversarial Flow Distillation (AFD) for distilling heterogeneous black-box video generation models into autoregressive students, using on-policy feedback and forward-process flow-matching updates.