This paper introduces Dynamic Chunking for Diffusion Language Models (DCDM), which replaces the fixed positional blocks of block discrete diffusion with content-defined semantic chunks produced by a differentiable Chunking Attention mechanism. It reports consistent improvements at model scales up to 1.5B parameters.