NVIDIA releases Nemotron-Labs-Diffusion, a family of tri-mode language models (3B, 8B, and 14B) that support autoregressive (AR), diffusion, and self-speculation decoding, achieving 2.7x to 4x speed-ups over standard AR decoding.