Trajectory as the Teacher: Few-Step Discrete Flow Matching via Energy-Navigated Distillation
Summary
This paper introduces Trajectory-Shaped Discrete Flow Matching (TS-DFM), which replaces the blind stochastic jumps used to build distillation trajectories with guided navigation by a lightweight energy model. On 170M-parameter language modeling, the distilled student at 8 steps reaches 32% lower perplexity than the 1,024-step teacher while running 128x faster. The shaping is applied only during training, so inference cost is unchanged.
Cached at: 05/11/26, 06:55 PM
Source: https://huggingface.co/papers/2605.07924
Abstract
Discrete flow matching with trajectory-shaped guidance improves text generation efficiency by replacing stochastic jumps with guided navigation, achieving better performance than traditional methods with significantly reduced computational requirements.
Discrete flow matching generates text by iteratively transforming noise tokens into coherent language, but may require hundreds of forward passes. Distillation uses the multi-step trajectory to train a student to reproduce the process in a few steps. When the student underperforms, the usual explanation is insufficient capacity. We argue the opposite: the trajectory is the bottleneck, not the student. Each training trajectory is built through a chain of blind stochastic jumps with no evaluation of sequence quality; a single bad decision at an early midpoint propagates through subsequent steps, yet the student must imitate the result. Trajectory-Shaped Discrete Flow Matching (TS-DFM) replaces these blind jumps with guided navigation: a lightweight energy compass evaluates candidate continuations at each midpoint, selecting the most coherent. All shaping is training-only; inference cost is unchanged. On 170M-parameter language modeling, the shaped student at 8 steps achieves 32% lower perplexity than the 1,024-step teacher while being 128x faster, with gains consistent across source distributions and three evaluators of increasing scale. TS-DFM achieves the best perplexity of any discrete-generation baseline we compare against, including methods trained on 6x more data or using 5x larger models.
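The idea of choosing among candidate continuations at each midpoint can be illustrated with a small toy sketch. This is not the paper's implementation: the function `build_trajectory`, the proposal `toy_propose`, and the scoring function `toy_energy` are hypothetical stand-ins, and the "energy" here is simply the number of tokens that disagree with a fixed reference sequence rather than a learned model. The sketch only shows the control flow: draw several stochastic jumps, score each, keep the lowest-energy one. Setting `num_candidates=1` removes the selection and gives the blind trajectory.

```python
import random
from typing import Callable, List

Tokens = List[int]


def build_trajectory(
    x0: Tokens,
    propose: Callable[[Tokens, int, random.Random], Tokens],
    energy: Callable[[Tokens], float],
    num_steps: int,
    num_candidates: int,
    rng: random.Random,
) -> List[Tokens]:
    """Return the list of states visited, starting from x0.

    At every midpoint, draw `num_candidates` stochastic jumps and keep the
    one with the lowest energy. With num_candidates=1 nothing is selected,
    which corresponds to a blind trajectory.
    """
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    state = list(x0)
    trajectory = [state]
    for step in range(num_steps):
        candidates = [propose(state, step, rng) for _ in range(num_candidates)]
        state = min(candidates, key=energy)  # lowest energy = most coherent
        trajectory.append(state)
    return trajectory


# Toy stand-ins (assumptions for illustration, not from the paper).
REFERENCE = [3, 1, 4, 1, 5, 9, 2, 6]
VOCAB_SIZE = 10


def toy_propose(state: Tokens, step: int, rng: random.Random) -> Tokens:
    # Resample each position uniformly from the vocabulary with probability 0.3.
    return [rng.randrange(VOCAB_SIZE) if rng.random() < 0.3 else tok for tok in state]


def toy_energy(state: Tokens) -> float:
    # Number of positions that disagree with the reference; lower is better.
    return float(sum(a != b for a, b in zip(state, REFERENCE)))


if __name__ == "__main__":
    noise_rng = random.Random(0)
    noise = [noise_rng.randrange(VOCAB_SIZE) for _ in REFERENCE]
    blind = build_trajectory(noise, toy_propose, toy_energy, 16, 1, random.Random(1))
    shaped = build_trajectory(noise, toy_propose, toy_energy, 16, 8, random.Random(1))
    print("blind final energy: ", toy_energy(blind[-1]))
    print("shaped final energy:", toy_energy(shaped[-1]))
```

In a distillation setting, the trajectory returned with selection enabled would serve as the training target for the few-step student. The selection loop runs only while building training data, which is consistent with the abstract's statement that inference cost is unchanged.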
Similar Articles
LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories
LeapAlign is a post-training method that improves flow matching model alignment with human preferences by reducing computational costs through two-step trajectory shortcuts while enabling stable gradient propagation to early generation steps. The method outperforms state-of-the-art approaches when fine-tuning Flux models across various image quality and text-alignment metrics.
Normalizing Trajectory Models
This paper introduces Normalizing Trajectory Models (NTM), a novel approach to diffusion-based generation that models reverse steps as conditional normalizing flows with exact likelihood training. NTM enables high-quality text-to-image generation in just four steps while retaining the likelihood framework, outperforming baselines on standard benchmarks.
FlowLM: Few-Step Language Modeling via Diffusion-to-Flow Adaptation
FlowLM introduces a flow matching language model derived from pre-trained diffusion models via efficient fine-tuning, enabling high-quality few-step text generation that rivals 2,000-step diffusion sampling with far fewer training epochs.
Self-Distilled Trajectory-Aware Boltzmann Modeling: Bridging the Training-Inference Discrepancy in Diffusion Language Models
This paper introduces TABOM, a self-distilled trajectory-based post-training framework for Diffusion Language Models that aligns training with inference trajectories using Boltzmann modeling to mitigate the training-inference discrepancy and reduce catastrophic forgetting.
Geometric Erasure by Contrastive Velocity Matching in Rectified Flows
This paper introduces GEM, a concept erasure framework for Rectified Flow models that combines trajectory-based unlearning with teacher-guided flow matching, achieving 5× faster and safer content suppression while preserving benign generation.