Tag
Introduces Multi-token Residual Prediction (MRP), a lightweight module for diffusion language models that enables dependency-aware multi-token denoising within a single backbone forward pass, achieving up to 1.42× lossless speedup.