Multi-Resolution Flow Matching: Training-Free Diffusion Acceleration via Staged Sampling

Hugging Face Daily Papers

Summary

MrFlow is a training-free multi-resolution acceleration strategy for flow-matching text-to-image models that combines low-resolution generation with pixel-space super-resolution and noise injection, achieving up to 25x end-to-end speedup without training or runtime modifications.

Hardware-agnostic strategies for accelerating text-to-image diffusion, such as timestep distillation and feature caching, can reduce inference time without custom kernels or system-level optimization. Among them, multi-resolution generation strategies have recently received broad attention, attaining more than 5x speedup without any training. However, because these methods upsample in the latent space and selectively modify only some regions, they exhibit noticeable blurring or artifacts. To address this, we propose MrFlow, a training-free multi-resolution acceleration strategy for pretrained flow-matching models built upon a staged low-to-high-resolution pipeline. MrFlow first rapidly generates the main structure at low resolution, then performs super-resolution in the pixel space using a lightweight pretrained GAN-based model, then injects low-strength noise to enable high-frequency resampling, and finally refines the details at high resolution. Quantitative and qualitative results on FLUX.1-dev and Qwen-Image show that MrFlow exploits the quadratic token reduction and reduced step requirement of low-resolution sampling to achieve 10x end-to-end acceleration while keeping the OneIG score within 1% of the unaccelerated baseline. It significantly surpasses other training-free acceleration strategies and requires neither training nor dynamic region identification at runtime. MrFlow is also orthogonal to pretrained timestep distillation strategies and can be combined with them directly, reaching up to 25x generation acceleration.
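
To make the "quadratic token reduction and reduced step requirement" argument concrete, here is a rough cost model in Python. It is a sketch under stated assumptions, not the paper's accounting. The 16-pixel patch side, the step counts, and the function names `num_tokens` and `staged_speedup` are all hypothetical, and the model ignores the cost of the super-resolution network, the VAE, and the text encoder.

```python
# Illustrative cost model only. Every number here is an assumption chosen
# for the example, not a measurement or setting reported by the paper.


def num_tokens(height: int, width: int, pixels_per_token_side: int = 16) -> int:
    """Tokens the transformer processes, assuming square patches.

    pixels_per_token_side=16 is an assumed value for illustration.
    """
    return (height // pixels_per_token_side) * (width // pixels_per_token_side)


def staged_speedup(
    base_steps: int,
    low_steps: int,
    high_steps: int,
    scale: float,
    exponent: float,
) -> float:
    """Estimated speedup of staged sampling over full-resolution sampling.

    scale:    low-resolution side length divided by the high-resolution one.
    exponent: 1.0 if per-step cost grows linearly with the token count,
              2.0 if it is dominated by self-attention (quadratic in tokens).
    Costs are expressed in units of one full-resolution step.
    """
    token_ratio = scale ** 2  # halving each side leaves a quarter of the tokens
    baseline_cost = float(base_steps)
    staged_cost = low_steps * token_ratio ** exponent + float(high_steps)
    return baseline_cost / staged_cost


if __name__ == "__main__":
    print("tokens at 1024x1024:", num_tokens(1024, 1024))
    print("tokens at 512x512:  ", num_tokens(512, 512))
    # Hypothetical schedule: 50 baseline steps versus 10 low-res + 3 high-res steps.
    for exponent in (1.0, 2.0):
        estimate = staged_speedup(50, 10, 3, scale=0.5, exponent=exponent)
        print(f"cost exponent {exponent}: estimated speedup {estimate:.2f}x")
```

Under this model the few high-resolution refinement steps dominate the staged cost, which is why cutting the number of steps that run at full resolution matters more than making the low-resolution stage cheaper still.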

Cached at: 07/03/26, 03:52 AM


Source: https://huggingface.co/papers/2607.01642

Abstract

MrFlow accelerates text-to-image diffusion by combining low-resolution generation with pixel-space super-resolution and noise injection, achieving up to 25x speedup without training or runtime modifications.

Hardware-agnostic strategies for accelerating text-to-image diffusion, such as timestep distillation and feature caching, can reduce inference time without custom kernels or system-level optimization. Among them, multi-resolution generation strategies have recently received broad attention, attaining more than 5x speedup without any training. However, the design of performing upsampling in the latent space, together with the selective modification of partial regions, causes these methods to exhibit noticeable blurring or artifacts. To this end, we propose MrFlow, a training-free multi-resolution acceleration strategy for pretrained flow-matching models built upon a staged low-to-high-resolution pipeline. MrFlow first rapidly generates the main structure at low resolution, then performs super-resolution in the pixel space using a lightweight pretrained GAN-based model, subsequently injects low-strength noise to enable high-frequency resampling, and finally refines the details at high resolution. Quantitative and qualitative results on FLUX.1-dev and Qwen-Image show that MrFlow exploits the quadratic token reduction and reduced step requirement of low-resolution sampling to achieve 10x end-to-end acceleration while keeping OneIG within a 1% gap relative to that before acceleration, significantly surpassing other training-free acceleration strategies, and requiring no training or runtime dynamic identification whatsoever. MrFlow can further be directly combined orthogonally with pre-trained timestep distillation strategies, achieving even higher generation acceleration of up to 25x.
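
The noise-injection and refinement stages can be sketched on a toy signal. This is a minimal illustration, not the authors' implementation. It assumes the linear interpolation convention x_t = (1 - t) * x_0 + t * eps that is common in flow-matching models, a uniform Euler integrator, and an oracle velocity that stands in for the pretrained model. The names `inject_noise`, `euler_refine`, and `oracle_velocity`, and the strength 0.3, are hypothetical. The abstract does not state the actual noise strength or step schedule.

```python
import random
from typing import Callable, List

Vector = List[float]


def inject_noise(x0: Vector, strength: float, rng: random.Random) -> Vector:
    """Move a clean sample to time t = strength on the assumed linear path
    x_t = (1 - t) * x_0 + t * eps, with eps drawn from a standard Gaussian."""
    return [(1.0 - strength) * v + strength * rng.gauss(0.0, 1.0) for v in x0]


def euler_refine(
    x_t: Vector,
    t_start: float,
    num_steps: int,
    velocity_fn: Callable[[Vector, float], Vector],
) -> Vector:
    """Integrate dx/dt = v(x, t) from t_start down to 0 with uniform Euler steps."""
    x = list(x_t)
    dt = t_start / num_steps
    for i in range(num_steps):
        t = t_start - i * dt  # current time, always strictly positive here
        v = velocity_fn(x, t)
        x = [xi - dt * vi for xi, vi in zip(x, v)]
    return x


def oracle_velocity(target: Vector) -> Callable[[Vector, float], Vector]:
    """Stand-in for the pretrained model: the exact velocity (x_t - x_0) / t
    toward a known target. A real pipeline would call the network instead."""
    return lambda x, t: [(xi - ti) / t for xi, ti in zip(x, target)]


if __name__ == "__main__":
    rng = random.Random(0)
    upscaled = [0.0, 1.0, -1.0, 0.5]  # toy stand-in for the super-resolved image
    noisy = inject_noise(upscaled, strength=0.3, rng=rng)
    refined = euler_refine(noisy, 0.3, 4, oracle_velocity(upscaled))
    print("noisy:  ", noisy)
    print("refined:", refined)
```

Because the injected strength is small, integration starts close to t = 0 and needs only a few steps, which fits the abstract's description of a short high-resolution refinement stage. With the oracle velocity the toy example simply recovers its input. A real model would instead resample high-frequency detail conditioned on the prompt.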




Similar Articles

Flow-OPD: On-Policy Distillation for Flow Matching Models

Hugging Face Daily Papers

Flow-OPD is a research paper introducing a two-stage on-policy distillation framework for Flow Matching text-to-image models, significantly improving generation quality and alignment metrics using Stable Diffusion 3.5 Medium.

Recursive Flow Matching

Hugging Face Daily Papers

Introduces Recursive Flow Matching (RecFM), a generative framework for forecasting complex spatiotemporal dynamics that achieves high fidelity with fewer steps and improved accuracy and speed, including up to 20x speedup over diffusion-based emulators.