Nemotron 3 Ultra is a 550B-parameter hybrid Mamba-Attention mixture-of-experts (MoE) language model. It was pre-trained on 20T tokens, extended to a 1M-token context, and post-trained with supervised fine-tuning (SFT), reinforcement learning (RL), and MOPD. It achieves up to 6x higher inference throughput than state-of-the-art LLMs of comparable accuracy, and it is open-sourced.