Multi-Token Residual Prediction
Summary
Introduces Multi-Token Residual Prediction (MRP), a lightweight module for diffusion language models that enables dependency-aware multi-token denoising within a single backbone forward pass, achieving up to a 1.42× lossless speedup.
Similar Articles
Supportive Token Revealing for Fast Diffusion Language Model Decoding
This paper proposes AXON, a training-free module that improves the quality-latency trade-off of discrete diffusion language model decoding by selecting 'anchor' tokens to reveal first. It picks these tokens using attention, uncertainty, and confidence signals so that they support subsequent denoising steps. Experiments on reasoning and code-generation benchmarks show that AXON reduces the number of function evaluations while maintaining or improving accuracy.
Efficient Diffusion LLMs via Temporal-Spatial Parallel Decoding and Confidence Extrapolation
This paper introduces Temporal-Spatial Parallel Decoding (TSPD) and Confidence Extrapolation (CE) to accelerate inference in diffusion-based large language models by dynamically deciding when tokens have converged and forecasting logit trends, reducing unnecessary denoising steps while preserving output quality.
RepFusion: Leveraging Multimodal Priors for Denoising in Representation Space
RepFusion proposes using multimodal large language models as noisy representation encoders for diffusion transformers in text-to-image generation, outperforming traditional denoising approaches.
DACA-GRPO: Denoising-Aware Credit Assignment for Reinforcement Learning in Diffusion Language Models
This paper identifies two weaknesses in existing reinforcement learning methods for diffusion language models, namely the lack of temporal credit assignment and biased likelihood estimates. It proposes DACA-GRPO, a plug-and-play enhancement that introduces denoising progress scores and stratified masking likelihood, achieving consistent improvements across reasoning, code generation, and constrained generation benchmarks.
Drifting Objectives for Refining Discrete Diffusion Language Models
This paper introduces TokenDrift, a drifting objective that refines discrete diffusion language models by lifting categorical predictions to a continuous semantic space for anti-symmetric drifting, significantly improving generation quality under a fixed number of denoising steps.