Tag
NVIDIA released Nemotron-TwoTower-30B-A3B-Base-BF16, a diffusion-based language model that uses block-wise autoregressive diffusion to generate text by iterative denoising of token blocks, achieving 2.42× the generation throughput of the autoregressive baseline while retaining 98.7% of benchmark quality.
The author claims that Nemotron 3 Ultra is more capable than GPT 5.5.
The tweet compares the post-training methods of Nemotron 3 Ultra and DeepSeek V4, noting both use multiple specialist teachers and on-policy distillation into a single student, but differ in support overlap.
Comparison of four large language models (≤120B parameters) on deep-context performance using Strix Halo hardware. Nemotron Super excels in prompt processing speed at deep context compared to the GPT-OSS and Qwen models.
Nous Research and NVIDIA have independently converged on the same architecture for persistent AI agents that live on servers and improve daily, marking a shift from coding copilots to autonomous server-side agents.
NVIDIA released the Nemotron repository with open training recipes, pipelines, and model weights for their Nemotron models, including the new Nemotron 3 Ultra and Nemotron 3 Nano Omni, supporting agentic AI and multimodal capabilities.
The author criticizes Frontier AI (GPT5.5 xHigh) for incorrectly suggesting Tensor Parallelism for a model that fits on a single GPU, and announces a planned shootout comparing several AI models (GPT5.5, Opus 4.8, Qwen variants, Nemotron) on a real-world problem.
A user shares that NVIDIA is currently offering top-tier AI models like Nemotron Ultra, DS4flash, Kimi, GLM, and Minimax3 for free with rate limiting, potentially benefiting personal users.
A comparison of four frontier AI models (Nemotron 3 Ultra, DeepSeek V4, MiniMax M3, Qwen 3.7 Max) on the same two prompts, with full results linked.
NVIDIA's Nemotron series of AI models is fully open source, with benchmarks, GitHub repos, data, and weights available; the models perform competitively, with NVFP4 benchmark results only 1% away.
Nemotron 3 Ultra is an open-weight release with an impressive capability-to-efficiency ratio, using a Mamba-2-attention hybrid stack and LatentMoE, and is larger than the previous Super variant.
NVIDIA has released Nemotron 3 Ultra, a new model designed to power faster and more efficient reasoning for long-running AI agents.
NVIDIA released Nemotron 3.5 ASR, an open-source multilingual speech-to-text model with the lowest latency tested, available in multilingual and English-only variants, ideal for voice agents and self-hosted deployments.
NVIDIA released Nemotron Ultra, a hybrid MoE model with 55B/550B parameters and a 1M context window, supporting MTP speculative decoding and available day-0 in transformers.
NVIDIA releases Nemotron-3-Ultra, a 550B-parameter open-weight model with a hybrid architecture combining Mamba-2, MoE, and attention, supporting up to 1M token context and configurable reasoning mode.
NVIDIA announces the upcoming release of Nemotron 3 Ultra this week.
Jensen Huang hints at more Nemotron model releases, highlighting open-source frontier intelligence and cost efficiency enabled by NVFP4 training.
NVIDIA announces the Nemotron 3 Ultra AI model.
NVIDIA introduces the MCG Toolkit, an automated pipeline that generates compliant model documentation (Model Card++ format) from source code in under a minute, leveraging RAG and NIM microservices.
Precomputed embedding vectors for the Nemotron-Personas dataset using Qwen 0.6B, enabling semantic search and clustering of synthetic personas via a web demo.