efficient-decoding


nvidia/Nemotron-Labs-Diffusion-14B

Hugging Face Models Trending · 2026-04-22

NVIDIA releases Nemotron-Labs-Diffusion, a family of tri-mode language models (3B, 8B, and 14B parameters) that support autoregressive (AR), diffusion, and self-speculation decoding, with reported speed-ups of 2.7x to 4x over standard AR decoding.
