Dystruct: Dynamically Structured Diffusion Language Model Decoding via Bayesian Inference

Hugging Face Daily Papers

Summary

DyStruct is a training-free Bayesian decoding framework for discrete Diffusion Language Models that enables flexible-length generation by dynamically determining expansion size and decoding order, improving accuracy on math and code tasks.

Diffusion language models (DLMs) have recently emerged as a promising alternative to autoregressive models, primarily due to their ability to enable parallel decoding. Despite this advantage, most existing DLMs rely on a fixed generation length specified prior to decoding, which restricts their flexibility in real-world applications. While a few recent works attempt to support flexible-length generation, they typically suffer from notable limitations: some require costly retraining to accommodate variable-length outputs, while others depend solely on local confidence signals during decoding. Such local criteria fail to capture the evolving structure of the sequence, often resulting in suboptimal generation quality. In this paper, we propose a training-free, Bayesian structured decoding framework that formulates flexible-length generation as a dynamic structural inference problem, jointly computing the expansion length, the block boundaries, and the decoding schedule. At each window expansion step, the method integrates local uncertainty with structural signals via a unified mechanism that supports dynamic structured generation, including both flexible block expansion and block organization, while maintaining coherence. Extensive experiments across multiple benchmarks demonstrate that our approach significantly improves generation quality and flexibility over existing fixed-length and flexible-length baselines. These results highlight the advantage of Bayesian structured decoding for diffusion language models, providing a principled and efficient solution for structured text generation.

Cached at: 05/12/26, 07:28 AM


Source: https://huggingface.co/papers/2605.09820

DyStruct is a training-free Bayesian decoding framework that enables flexible-length generation in discrete Diffusion Language Models (DLMs).

While discrete diffusion models offer the architectural advantage of parallel decoding, they are typically constrained by fixed sequence lengths. Existing methods for variable-length generation rely either on strictly left-to-right truncation heuristics, which force premature token commitments, or on costly custom alignment training.

DyStruct formulates sequence expansion as a pure inference-time structural problem, utilizing a Bayesian framework to dynamically determine expansion size, block partitioning, and decoding order. The method executes non-monotonically: a Chinese Restaurant Process (CRP) prior and context-aware Gibbs scheduling actively search for and anchor stable sequence segments first (such as initial setups and final answer formats). These stable anchors are then used to bidirectionally constrain highly unstable intermediate reasoning steps.
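The paper does not publish its exact sampler here, but the Chinese Restaurant Process prior it builds on is a standard construct: each new position joins an existing block with probability proportional to that block's size, or opens a new block with probability proportional to a concentration parameter `alpha`. A minimal, self-contained sketch (the function name and parameters are illustrative, not DyStruct's actual API):

```python
import random

def crp_partition(n_tokens, alpha=1.0, rng=None):
    """Sample a block assignment for n_tokens positions from a
    Chinese Restaurant Process with concentration alpha.
    Returns a list mapping each position to a block id (0, 1, 2, ...)."""
    rng = rng or random.Random(0)
    assignment = []
    block_sizes = []  # block_sizes[k] = number of positions already in block k
    for _ in range(n_tokens):
        # Joining block k has weight |block k|; opening a new block has weight alpha.
        weights = block_sizes + [alpha]
        r = rng.random() * sum(weights)
        k = 0
        while r >= weights[k]:
            r -= weights[k]
            k += 1
        if k == len(block_sizes):
            block_sizes.append(1)  # open a new block
        else:
            block_sizes[k] += 1    # join an existing block
        assignment.append(k)
    return assignment
```

Larger `alpha` yields more, smaller blocks; `alpha → 0` collapses everything into one block. DyStruct couples a prior of this family with context-aware Gibbs scheduling, resampling block structure as decoding confidence accumulates.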

By allocating unmasking iterations based strictly on structural instability, the algorithm naturally terminates early on rigid tasks (such as arithmetic templates) to optimize compute, while reserving deep refinement steps for complex logic. Evaluated on LLaDA-8B and Dream-7B, this approach yields strict accuracy improvements across mathematical reasoning and code synthesis, including a +4.4 exact match increase on Big-Bench Hard.
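The compute-allocation idea above can be sketched in a few lines: give each block a refinement budget proportional to an instability score, and treat near-zero-instability blocks as fixed anchors so rigid spans terminate immediately. This is an illustrative sketch only; the scoring function, threshold, and step counts are assumptions, not the paper's actual hyperparameters.

```python
def schedule_refinement(block_instability, base_steps=4, threshold=0.05):
    """Map per-block instability scores in [0, 1] to unmasking-iteration
    budgets. Blocks below `threshold` are treated as stable anchors and
    receive zero further refinement (early termination); unstable blocks
    get up to `base_steps` iterations, scaled by their instability."""
    steps = {}
    for block_id, u in block_instability.items():
        steps[block_id] = 0 if u < threshold else max(1, round(base_steps * u))
    return steps
```

For example, a rigid arithmetic-template block scoring 0.01 gets no extra iterations, while an intermediate reasoning block scoring 0.9 receives nearly the full budget.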

Similar Articles

DALM: A Domain-Algebraic Language Model via Three-Phase Structured Generation

arXiv cs.CL

DALM proposes a domain-algebraic language model that generates text under exact structural constraints derived from a domain lattice, addressing hallucination by organizing knowledge into separate domain fibers with algebraic guarantees. The model uses three-phase structured denoising (domain → relation → concept) with domain-annotated training data to prevent cross-domain contamination.

DFlash: Block Diffusion for Flash Speculative Decoding

Papers with Code Trending

DFlash is a new speculative decoding framework that uses a lightweight block diffusion model for parallel token drafting, achieving over 6x acceleration compared to autoregressive methods. It significantly outperforms existing state-of-the-art methods like EAGLE-3 while maintaining high output quality.