Tag
The Fast Byte Latent Transformer (BLT-D) has been accepted to ICML 2026, introducing a text diffusion method for parallel byte-level decoding to overcome the speed limitations of traditional byte-level language models.
This paper introduces BLT Diffusion and speculative decoding techniques for byte-level language models to significantly reduce generation latency and memory bandwidth costs while maintaining quality.