Normalizing Trajectory Models

Hugging Face Daily Papers

Summary

This paper introduces Normalizing Trajectory Models (NTM), an approach to diffusion-based generation that models each reverse step as a conditional normalizing flow trained with exact likelihood. NTM generates high-quality text-to-image samples in just four steps while retaining the likelihood framework, matching or outperforming strong baselines on text-to-image benchmarks.

Diffusion-based models decompose sampling into many small Gaussian denoising steps -- an assumption that breaks down when generation is compressed to a few coarse transitions. Existing few-step methods address this through distillation, consistency training, or adversarial objectives, but sacrifice the likelihood framework in the process. We introduce Normalizing Trajectory Models (NTM), which models each reverse step as an expressive conditional normalizing flow with exact likelihood training. Architecturally, NTM combines shallow invertible blocks within each step with a deep parallel predictor across the trajectory, forming an end-to-end network trainable from scratch or initializable from pretrained flow-matching models. Its exact trajectory likelihood further enables self-distillation: a lightweight denoiser trained on the model's own score produces high-quality samples in four steps. On text-to-image benchmarks, NTM matches or outperforms strong image generation baselines in just four sampling steps while uniquely retaining exact likelihood over the generative trajectory.
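One way to read "exact likelihood over the generative trajectory" is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $x_T$ is the initial noise, $x_0$ the final sample, $f_\theta$ the invertible map of one reverse step, and $p_Z$ a simple base density. The sketch also assumes that each step is conditioned only on the previous state; the paper's parallel predictor may condition on more.

```latex
% Assumed factorization of the reverse trajectory into T conditional steps
\log p_\theta(x_{0:T-1} \mid x_T) = \sum_{t=1}^{T} \log p_\theta(x_{t-1} \mid x_t)

% Each step is a conditional normalizing flow, so its density follows
% from the change-of-variables formula with z = f_\theta(x_{t-1}; x_t, t)
\log p_\theta(x_{t-1} \mid x_t)
  = \log p_Z\big(f_\theta(x_{t-1}; x_t, t)\big)
  + \log \left| \det \frac{\partial f_\theta(x_{t-1}; x_t, t)}{\partial x_{t-1}} \right|
```

Under these assumptions every term is computable in closed form, which is what distinguishes a flow-based step from a fixed Gaussian denoising step when $T$ is small.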

Cached at: 05/11/26, 02:42 AM

Source: https://huggingface.co/papers/2605.08078

Abstract

Normalizing Trajectory Models introduce a novel approach to diffusion-based generation by modeling each reverse step as an expressive conditional normalizing flow with exact likelihood training, enabling high-quality sample generation in a few steps while maintaining the likelihood framework.

Diffusion-based models decompose sampling into many small Gaussian denoising steps -- an assumption that breaks down when generation is compressed to a few coarse transitions. Existing few-step methods address this through distillation, consistency training, or adversarial objectives, but sacrifice the likelihood framework in the process. We introduce Normalizing Trajectory Models (NTM), which models each reverse step as an expressive conditional normalizing flow with exact likelihood training. Architecturally, NTM combines shallow invertible blocks within each step with a deep parallel predictor across the trajectory, forming an end-to-end network trainable from scratch or initializable from pretrained flow-matching models. Its exact trajectory likelihood further enables self-distillation: a lightweight denoiser trained on the model’s own score produces high-quality samples in four steps. On text-to-image benchmarks, NTM matches or outperforms strong image generation baselines in just four sampling steps while uniquely retaining exact likelihood over the generative trajectory.
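The idea of a reverse step that is a conditional normalizing flow with an exact likelihood can be illustrated with a toy example. The sketch below is not the paper's architecture: it assumes a single elementwise affine flow per step, and `step_params`, `forward`, `log_prob`, and the constants inside them are hypothetical names and values chosen only for illustration. It shows that each step has a closed-form log-density and that the log-likelihood of a four-step trajectory is the sum of the per-step terms.

```python
import numpy as np


def step_params(x_prev, step):
    """Hypothetical conditioner: maps the previous state to a shift and log-scale."""
    shift = 0.5 * x_prev + 0.1 * step
    log_scale = -0.1 * np.tanh(x_prev)
    return shift, log_scale


def forward(z, x_prev, step):
    """Sampling direction: push base noise z through the affine flow."""
    shift, log_scale = step_params(x_prev, step)
    return shift + np.exp(log_scale) * z


def log_prob(x, x_prev, step):
    """Exact log p(x | x_prev) via the change-of-variables formula."""
    shift, log_scale = step_params(x_prev, step)
    z = (x - shift) * np.exp(-log_scale)          # inverse map
    log_base = -0.5 * np.sum(z**2 + np.log(2 * np.pi))  # standard normal base
    return log_base - np.sum(log_scale)           # plus log|det dz/dx|


def sample_trajectory(x_start, num_steps, rng):
    """Run num_steps flow steps, each conditioned on the previous state."""
    states = [x_start]
    for t in range(num_steps):
        z = rng.standard_normal(x_start.shape)
        states.append(forward(z, states[-1], t))
    return states


def trajectory_log_prob(states):
    """Sum of per-step conditional log-likelihoods along the trajectory."""
    return sum(
        log_prob(states[t + 1], states[t], t) for t in range(len(states) - 1)
    )


rng = np.random.default_rng(0)
x_start = rng.standard_normal(3)       # stand-in for the initial noise
states = sample_trajectory(x_start, 4, rng)
total = trajectory_log_prob(states)
print(f"log-likelihood of the 4-step trajectory: {total:.3f}")
```

A single affine step is itself Gaussian, so this toy gains nothing over a standard denoising step; the point of stacking several invertible blocks per step, as the abstract describes, is to obtain non-Gaussian transitions while keeping the same exact-likelihood computation.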

Get this paper in your agent:

hf papers read 2605.08078

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Similar Articles

Trajectory as the Teacher: Few-Step Discrete Flow Matching via Energy-Navigated Distillation

Hugging Face Daily Papers

This paper introduces Trajectory-Shaped Discrete Flow Matching (TS-DFM), which replaces blind stochastic jumps with guided navigation to significantly improve text generation efficiency and reduce computational costs. The method achieves superior perplexity and speed compared to traditional multi-step baselines while maintaining unchanged inference costs.

LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories

Hugging Face Daily Papers

LeapAlign is a post-training method that improves flow matching model alignment with human preferences by reducing computational costs through two-step trajectory shortcuts while enabling stable gradient propagation to early generation steps. The method outperforms state-of-the-art approaches when fine-tuning Flux models across various image quality and text-alignment metrics.