Tag
SOLAR is a framework that automatically derives validated speed-of-light performance bounds from PyTorch and JAX source code using an LLM frontend and deterministic analysis, enabling headroom analysis and optimization insights for deep learning workloads.
The paper proposes Nemotron-TwoTower, a diffusion language model that decouples context representation from denoising by pairing a frozen autoregressive tower with a trainable diffusion denoiser, retaining 98.7% of baseline quality at 2.42× the throughput.
A tweet shares an anecdote about NVIDIA's engineering culture, where the lack of layoffs fosters collaboration instead of internal competition.
Liquid AI's LFM2.5-230M model demonstrates multi-step tool calling on a Unitree G1 robot, where it runs entirely on-device on an NVIDIA Jetson Orin and acts as a skill-selection layer.
NVIDIA announces GeForce NOW summer sale discounts and new game additions to the cloud gaming library during the Steam Summer Sale, highlighting the benefits of cloud gaming.
NVIDIA released Nemotron-TwoTower-30B-A3B-Base-BF16, a diffusion-based language model that uses block-wise autoregressive diffusion to generate text by iterative denoising of token blocks, achieving 2.42× the generation throughput of the autoregressive baseline while retaining 98.7% of benchmark quality.
A discussion questioning why LLMs haven't helped ROCm and Intel's software ecosystems catch up to CUDA, highlighting NVIDIA's premium pricing and the need for genuine market competition.
The author successfully ran GLM-5.2 with MTP speculative decoding on a 4× DGX Spark (GB10) setup, revealing a missing component in the public build recipe.
Nvidia's AI chips are selling at record-high prices in China due to US export restrictions, while the company also announced a new liquid-cooling system to reduce data center water usage.
The article discusses a sell-off in AI-related tech stocks, raising doubts about whether the massive spending on artificial intelligence will yield returns. It highlights market volatility, with major companies like Micron, Nvidia, and Alphabet experiencing significant drops.
NVIDIA and AWS announce new EC2 G7 instances with NVIDIA RTX PRO 4500 Blackwell GPUs and GPU-accelerated vector search in Amazon OpenSearch Serverless, enabling enterprises to deploy AI at production scale with improved performance and reduced operational complexity.
NVIDIA announces DFlash, an open-source block diffusion model for speculative decoding that achieves up to 15x higher inference throughput on Blackwell GPUs while maintaining interactivity.
The post argues that NVIDIA's new chips, which enable running 500B-parameter models locally, show AI safety measures to be merely behavioral speed bumps that vanish offline, posing unprecedented risks of deception and manipulation at scale.
Nvidia has quietly acqui-hired the team from Essential AI, including Transformer paper coauthor Ashish Vaswani, who was struggling to raise funds for his startup. Vaswani will work on Nvidia's Nemotron open-source models.
NVIDIA launches the BioNeMo Agent Toolkit, an open toolkit that enables AI agents to perform tasks like protein structure prediction, molecular docking, and generative chemistry, accelerating programmable biology in collaboration with Arc Institute.
Nvidia claims a 15x speedup in text generation from a diffusion model that generates entire blocks of tokens at once.
An analysis of hiring data across major AI labs infers their strategic directions, noting xAI's focus on scientific tutors, Nvidia's data center push, and OpenAI's engineering growth.
New GGUF quantizations of Qwen3.6-27B optimized for 16GB VRAM NVIDIA GPUs, including an experimental Trellis variant, with perplexity benchmarks.
Spark Doctor is an open-source diagnostic CLI for NVIDIA DGX Spark that collects system, GPU, memory, Docker, and recipe data, applies specific rules, and outputs the likely cause and next steps for common issues.
SGLang provided Day-0 support for DeepSeek-V4, and collaboration between the LMSys and NVIDIA engineering teams achieved up to a 5x throughput increase in production, with the improvements shown on the SemiAnalysis InferenceX dashboard.