Tag
This paper theoretically analyzes diffusion language models through a bias-variance lens, identifying trade-offs between masking and uniform diffusion kernels. It proposes SemDLM+, which adds a global transition and semantic-frequency penalty to overcome the semantic basin problem, achieving competitive generation quality on the LM1B and OpenWebText benchmarks.