nvidia

#nvidia

Got GLM-5.2 + MTP speculative decode running on 4× DGX Spark (GB10) — and the build piece the public recipe is missing

Reddit r/LocalLLaMA ↗ · 8h ago

The author ran GLM-5.2 with MTP speculative decoding on a 4× DGX Spark (GB10) setup and describes the build step that the public recipe is missing.

#nvidia

Nvidia's AI Chips Double in Price in China as It Tackles AI's Water Problem

Reddit r/ArtificialInteligence ↗ · 16h ago Cached

Nvidia's AI chips are selling at record high prices in China due to US export restrictions, while the company also announced a new liquid-cooling system to reduce data center water usage.

#nvidia

Is AI 'one big bubble'? Behind the tech sell-off

Reddit r/artificial ↗ · 20h ago Cached

The article discusses a sell-off in AI-related tech stocks, raising doubts about whether the massive spending on artificial intelligence will yield returns. It highlights market volatility, with major companies like Micron, Nvidia, and Alphabet experiencing significant drops.

#nvidia

NVIDIA and AWS Collaborate to Bring AI to Production at Scale

NVIDIA Blog ↗ · yesterday Cached

NVIDIA and AWS announce new EC2 G7 instances with NVIDIA RTX PRO 4500 Blackwell GPUs and GPU-accelerated vector search in Amazon OpenSearch Serverless, enabling enterprises to deploy AI at production scale with improved performance and reduced operational complexity.

#nvidia

@charles_irl: dflash go brr

X AI KOLs Timeline ↗ · yesterday Cached

NVIDIA announces DFlash, an open-source block diffusion model for speculative decoding that achieves up to 15x higher inference throughput on Blackwell GPUs while maintaining interactivity.

#nvidia

NVIDIA's new chips just proved AI "safety" was always theater. We are not ready for 2029.

Reddit r/ArtificialInteligence ↗ · yesterday

The post argues that NVIDIA's new chips, which it says can run 500B-parameter models locally, show that AI safety measures are merely behavioral speed bumps that vanish offline, and that this poses unprecedented risks of deception and manipulation at scale.

#nvidia

AI Bubble about to Burst? Nvidia quietly acquihires Essential AI team, including Transformer coauthor Ashish Vaswani. Vaswani was struggling to raise money for his AI company.

Reddit r/ArtificialInteligence ↗ · yesterday

Nvidia has quietly acquihired the team from Essential AI, including Transformer paper coauthor Ashish Vaswani, who was struggling to raise funds for his startup. Vaswani will work on Nvidia's Nemotron open-source models.

#nvidia

@arcinstitute: The future of biology is agentic. We're proud to work with NVIDIA on the Evo series of models and are excited to see th…

X AI KOLs Following ↗ · yesterday Cached

NVIDIA launches the BioNeMo Agent Toolkit, an open toolkit that enables AI agents to perform tasks like protein structure prediction, molecular docking, and generative chemistry, accelerating programmable biology in collaboration with Arc Institute.

#nvidia

I'm eager for a 15x speedup on my strix halo

Reddit r/LocalLLaMA ↗ · yesterday

Nvidia claims a 15x speedup in text generation using a diffusion model, generating entire blocks at once.

#nvidia

xAI posted 65 jobs for biology, physics, and chemistry tutors. Tracked hiring data across 8 AI labs to see what each one's actually building toward

Reddit r/ArtificialInteligence ↗ · yesterday

The post analyzes hiring data across eight AI labs to infer their strategic directions, noting xAI's focus on science tutors, Nvidia's data center push, and OpenAI's engineering growth.

#nvidia

UPDATE: Qwen-27B-IQ4_KS and Qwen-27B-IQ_KS_KT for ik_llama.cpp, especially for NVIDIA with 16GB VRAM

Reddit r/LocalLLaMA ↗ · yesterday

New GGUF quantizations of Qwen3.6-27B optimized for 16GB VRAM NVIDIA GPUs, including an experimental Trellis variant, with perplexity benchmarks.

#nvidia

@aijoey: for all my new dgx spark owners. https://github.com/joeynyc/spark-doctor…

X AI KOLs Timeline ↗ · yesterday Cached

Spark Doctor is an open-source diagnostic CLI for NVIDIA DGX Spark that collects system, GPU, memory, Docker, and recipe data, applies specific rules, and outputs the likely cause and next steps for common issues.

#nvidia

@PyTorch: While SGLang provided Day-0 support for DeepSeek-V4, the collaboration between the @lmsysorg and @NVIDIAAI engineering …

X AI KOLs Following ↗ · yesterday Cached

SGLang provided Day-0 support for DeepSeek-V4, and collaboration between the LMSys and NVIDIA engineering teams achieved up to a 5x throughput increase in production, with the improvements shown on the SemiAnalysis InferenceX dashboard.

#nvidia

How Businesses Are Building Specialized AI They Can Trust

NVIDIA Blog ↗ · yesterday Cached

NVIDIA introduces the Agent Toolkit, an open modular foundation with models, tools, skills, and a secure runtime to help businesses build specialized, trustworthy AI agents for various industries.

#nvidia

Valve confirms it’s working with Intel and Nvidia on SteamOS for more GPUs

The Verge ↗ · yesterday Cached

Valve is working with Intel and Nvidia to expand SteamOS support to more GPUs and handhelds, with initial firmware for Intel handhelds and ongoing driver work for Nvidia.

#nvidia

@DataChaz: @NVIDIA just quietly dropped an incredibly impressive speech recognition model that completely changes the math for loc…

X AI KOLs Timeline ↗ · yesterday Cached

NVIDIA quietly released Nemotron-3.5-ASR, a lightweight 0.6B parameter open-source speech recognition model designed for real-time streaming with support for 40+ languages, low latency, and cache-aware architecture.

#nvidia

NVIDIA Powers Over 400 of the World’s 500 Fastest Supercomputers

NVIDIA Blog ↗ · yesterday Cached

NVIDIA technology now powers over 400 of the world's 500 fastest supercomputers (81% of the TOP500), with record GPU and networking adoption and top efficiency on the Green500 list.

#nvidia

NVIDIA Brings Trusted, 24/7 AI Agents to Telecom Operations

NVIDIA Blog ↗ · yesterday Cached

NVIDIA announces new AI agents and tools for telecom operations, including synthetic data generation and secure agent runtimes, showcased at DTW Ignite 2026. The platform aims to enable autonomous networks by combining domain-specific models, privacy-safe synthetic data, and policy-based guardrails.

#nvidia

@RayFernando1337: “The selected runtime uses NVFP4 weights for maximum performance. From the original FP8 weights, we performed an in-hou…

X AI KOLs Following ↗ · 2d ago

The post discusses using NVFP4 (4-bit floating-point) weights for maximum performance, produced by in-house quantization of the original FP8 weights with NVIDIA ModelOpt, and highlights the format's dual scale factors for high dynamic range.

#nvidia

@philipkiely: https://x.com/philipkiely/status/2069212319746506968

X AI KOLs Timeline ↗ · 2d ago Cached

Baseten announces the world's fastest API for the GLM-5.2 open model, achieving over 280 tokens per second via NVFP4 quantization, disaggregated inference, and other optimizations.
