VoidPadding introduces a dedicated [VOID] token to handle padding in masked diffusion language models, so that [EOS] is used only to signal semantic termination. The method improves performance on reasoning and coding benchmarks while reducing the number of decoding steps.
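To make the token roles concrete, here is a minimal sketch of how a fixed-length training target might be laid out under the two schemes. It is an illustration under stated assumptions, not the authors' implementation: the helper names `pad_with_eos`, `pad_with_void`, and `strip_padding` are hypothetical, the baseline is assumed to reuse [EOS] as filler, and the layout (a single [EOS] followed by [VOID] tokens) is inferred only from the description above.

```python
from typing import List

EOS = "[EOS]"    # marks the semantic end of the response
VOID = "[VOID]"  # padding-only token, as described in the text


def pad_with_eos(tokens: List[str], length: int) -> List[str]:
    """Assumed baseline: [EOS] doubles as both terminator and padding."""
    if len(tokens) + 1 > length:
        raise ValueError("sequence plus [EOS] does not fit in the target length")
    return tokens + [EOS] * (length - len(tokens))


def pad_with_void(tokens: List[str], length: int) -> List[str]:
    """Assumed VoidPadding-style layout: one [EOS], then [VOID] filler."""
    if len(tokens) + 1 > length:
        raise ValueError("sequence plus [EOS] does not fit in the target length")
    return tokens + [EOS] + [VOID] * (length - len(tokens) - 1)


def strip_padding(padded: List[str]) -> List[str]:
    """Recover the response: keep everything before the first [EOS]."""
    return padded[: padded.index(EOS)] if EOS in padded else list(padded)


response = ["def", "f", "(", ")", ":"]
baseline = pad_with_eos(response, 8)
voided = pad_with_void(response, 8)
```

In this sketch the baseline target contains three [EOS] tokens, so the termination signal and the filler are indistinguishable, while the [VOID] layout contains exactly one [EOS] and the rest of the filler is carried by a token with no semantic role.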