Who Should Lead Decoding Now? Tracking Reliable Trajectories for Ensembling Masked Diffusion Language Models
Summary
This paper proposes TIE, a knowledge fusion framework for masked diffusion language models that tracks confidence dynamics to identify reliable decoding trajectories and iteratively transfers partially denoised sequences between models, improving generation quality on reasoning tasks.
Cached at: 06/16/26, 11:34 AM
Source: https://huggingface.co/papers/2606.16281
Abstract
Masked diffusion language models exhibit unique decoding dynamics where reliable trajectories show stable confidence patterns, enabling iterative ensemble methods that transfer partially denoised sequences between models based on confidence evolution.
Masked Diffusion Language Models (MDLMs) have emerged as a distinct paradigm for sequence generation. As MDLMs become diverse in capabilities and knowledge coverage, an important question is how to combine their knowledge. Toward this, we first investigate the unique decoding dynamics of MDLMs. We find that successful generations exhibit stable confidence dynamics over answer-relevant positions, while unreliable trajectories can often be corrected by injecting promising intermediate states from other models. Guided by this observation, we propose TIE (Trajectory-based Iterative Ensembling), a knowledge fusion framework in which MDLMs iteratively identify reliable decoding trajectories and relay them across models. TIE tracks confidence dynamics over answer-relevant positions to determine which model currently follows a more reliable trajectory and selectively transfers partially denoised sequences across models. As the model on the more promising trajectory often changes across denoising steps, TIE allows different models to contribute complementary strengths at different stages of generation. Strong performance across diverse reasoning tasks, along with our analyses, suggests that TIE offers a practical approach to the underexplored problem of MDLM ensembling.
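The abstract describes the mechanism only at a high level, so the following is a toy sketch of the general idea rather than the paper's algorithm. It assumes that each model exposes per-step, per-position confidences, that "stability" can be approximated as the recent mean confidence minus its standard deviation over answer-relevant positions, and that a transfer simply copies the leading model's partially denoised sequence. The names `stability_score`, `pick_leader`, `relay`, and `MASK`, the scoring rule, and all numbers are hypothetical.

```python
from statistics import mean, pstdev

MASK = None  # hypothetical marker for a position that is still masked


def stability_score(conf_history, answer_positions, window=3):
    """Assumed proxy for trajectory reliability (not the paper's metric).

    conf_history: one list of per-position confidences per denoising step.
    Returns the recent mean confidence over the answer positions minus its
    standard deviation across steps, so high and steady confidence scores best.
    """
    recent = conf_history[-window:]
    per_step = [mean(step[i] for i in answer_positions) for step in recent]
    return mean(per_step) - pstdev(per_step)


def pick_leader(histories, answer_positions, window=3):
    """Return the model name with the highest stability score, plus all scores."""
    scores = {
        name: stability_score(history, answer_positions, window)
        for name, history in histories.items()
    }
    return max(scores, key=scores.get), scores


def relay(states, leader):
    """Give every model its own copy of the leader's partially denoised sequence."""
    return {name: list(states[leader]) for name in states}


# Made-up confidences for two models over three denoising steps.
histories = {
    "model_a": [[0.1, 0.9, 0.9], [0.1, 0.3, 0.3], [0.1, 0.7, 0.7]],  # fluctuating
    "model_b": [[0.2, 0.6, 0.6], [0.3, 0.7, 0.7], [0.3, 0.8, 0.8]],  # steady
}
answer_positions = [1, 2]  # assumed to be known in advance

states = {
    "model_a": ["The", MASK, "is", "7"],
    "model_b": ["The", "answer", "is", MASK],
}

leader, scores = pick_leader(histories, answer_positions)
new_states = relay(states, leader)
print(leader, new_states["model_a"])
```

In a full loop, this selection would be repeated after each denoising step or block of steps, so the leading model can change during generation, as the abstract notes.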
Similar Articles
Efficient Diffusion LLMs via Temporal-Spatial Parallel Decoding and Confidence Extrapolation
This paper introduces Temporal-Spatial Parallel Decoding (TSPD) and Confidence Extrapolation (CE) to accelerate inference in diffusion-based large language models by dynamically deciding when tokens have converged and forecasting logit trends, reducing unnecessary denoising steps while preserving output quality.
Masked Diffusion Decoding as $x$-Prediction Flow
This paper reinterprets masked diffusion language model decoding as continuous clean-state prediction, introducing a flow-based framework where tokens are updated continuously and asynchronously based on confidence, achieving 97% of LLaDA's performance with 25% of the decoding budget.
Supportive Token Revealing for Fast Diffusion Language Model Decoding
This paper proposes AXON, a training-free module that improves the quality-latency trade-off of discrete diffusion language model decoding by intelligently selecting 'anchor' tokens to reveal first, using attention, uncertainty, and confidence signals to support subsequent denoising steps. Experiments on reasoning and code-generation benchmarks show AXON reduces function evaluations while maintaining or improving accuracy.
The Confidence Shortcut: A Reasoning Failure Mode of Masked Diffusion Models
This paper identifies a failure mode in masked diffusion language models where confidence-based decoding leads to high-confidence errors on complex reasoning tasks, and shows that confidence-aligned training exacerbates this issue while random masking preserves reasoning performance.
Speculative Refinement: A Hybrid Autoregressive Diffusion Decoding Strategy and Its Behavior Across Benchmarks
Introduces Speculative Refinement (SpecRef), a training-free hybrid decoding strategy that warm-starts a masked diffusion language model from an autoregressive draft using entropy-guided selective masking. Evaluated across six benchmarks, it reveals that code benchmarks conflate structural discovery with logical correctness, identifies a refinement tension phenomenon, and shows that evaluation protocols can produce different model rankings.