@rohanpaul_ai: NVIDIA just posted the first agentic AI benchmark results where GB300 NVL72 runs up to 20x more coding agents per megaw…
Summary
NVIDIA published the first agentic AI benchmark results showing the GB300 NVL72 can run up to 20x more coding agents per megawatt than the H200, using the AgentPerf benchmark from Artificial Analysis.
Cached at: 06/13/26, 01:04 AM
NVIDIA just posted the first agentic AI benchmark results where GB300 NVL72 runs up to 20x more coding agents per megawatt than H200.
Older inference benchmarks mostly ask how fast a system can produce tokens after one prompt.
AgentPerf, from Artificial Analysis, asks a harder question: how many agents can run at the same time while still feeling responsive?
It tests a harder workload than normal LLM serving because an agent is not one request and one answer, but a long chain of model calls, code edits, command runs, tool delays, and growing context.
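To make the "growing context" point concrete, here is a minimal sketch of how the prompt size accumulates over one agent session. All numbers (starting context, tokens added per step, step count) are hypothetical and chosen only for illustration; they are not taken from AgentPerf, and the sketch ignores prefix caching, which a real serving stack would likely use to avoid reprocessing earlier context.

```python
# Illustrative only: every constant below is an assumption, not an AgentPerf figure.

def context_sizes(initial_tokens: int, tokens_added_per_step: int, steps: int) -> list[int]:
    """Return the prompt size seen by each model call in one agent session.

    Each step (a code edit, a command run, a tool result) is assumed to append
    a fixed number of tokens to the context before the next model call.
    """
    return [initial_tokens + i * tokens_added_per_step for i in range(steps)]


# Hypothetical session: starts at 5,000 tokens, grows by 2,000 tokens per step, 10 calls.
sizes = context_sizes(initial_tokens=5_000, tokens_added_per_step=2_000, steps=10)

# Total prompt tokens the server would process with no caching at all.
total_prompt_tokens = sum(sizes)

print("context per call:", sizes)
print("total prompt tokens:", total_prompt_tokens)
```

Under these assumed numbers, a single session processes far more prompt tokens than one 5,000-token request would, which is one way to see why an agent differs from one request and one answer.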
The benchmark replays real coding-agent paths from public repos across 12+ programming languages, with request lengths from 5K to 131K tokens and an average near 27K tokens.
NVIDIA says GB300 NVL72 reaches 61.4K concurrent agents per megawatt at the lowest service tier, while H200 reaches 2.6K.
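As a quick arithmetic check on the two figures quoted above, the sketch below computes their ratio and scales each to a power budget. The helper name `agents_for_power` and the 0.5 MW budget are assumptions made for illustration, and linear scaling with power is assumed rather than claimed by the source. The ratio applies only to these two lowest-tier figures; the headline "up to 20x" claim is the post's own and is not re-derived here.

```python
# Figures as quoted in the post: concurrent agents per megawatt, lowest service tier.
GB300_NVL72_AGENTS_PER_MW = 61_400
H200_AGENTS_PER_MW = 2_600


def agents_for_power(agents_per_mw: float, megawatts: float) -> float:
    """Scale an agents-per-megawatt figure to a power budget.

    Assumes agent capacity scales linearly with power, which is a simplification.
    """
    return agents_per_mw * megawatts


# Ratio of the two quoted figures at this service tier.
ratio_at_lowest_tier = GB300_NVL72_AGENTS_PER_MW / H200_AGENTS_PER_MW

# Hypothetical 0.5 MW budget, chosen only for illustration.
gb300_agents = agents_for_power(GB300_NVL72_AGENTS_PER_MW, 0.5)
h200_agents = agents_for_power(H200_AGENTS_PER_MW, 0.5)

print(f"ratio of quoted figures: {ratio_at_lowest_tier:.1f}x")
print(f"agents at 0.5 MW: GB300 NVL72 = {gb300_agents:.0f}, H200 = {h200_agents:.0f}")
```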
The gain comes from 72 GPUs acting like one rack-scale machine through NVLink, plus software that spreads MoE expert work, overlaps communication with compute, and keeps batches large.
@NVIDIAAIDev
Similar Articles
NVIDIA Blackwell Leads on First Agentic AI Infrastructure Benchmark
NVIDIA's Blackwell GB300 NVL72 platform leads the first agentic AI infrastructure benchmark, AgentPerf from Artificial Analysis, delivering up to 20x more agents per megawatt than the previous Hopper generation.
@mr_r0b0t: 16 local AI agents streaming at once! MiniMax M2.7 NVFP4 — 2x GB10, no cloud APIs.
A demonstration shows 16 local AI agents streaming simultaneously using MiniMax M2.7 NVFP4 on two NVIDIA GB10 chips, with no cloud APIs required.
@Saboo_Shubham_: Everyone will run a team of AI Agents on their PC by end of 2026. NVIDIA RTX Spark with 128GB unified memory is built f…
The tweet predicts that by the end of 2026 everyone will run AI agents on their PC, highlights the NVIDIA RTX Spark, which has 128GB of unified memory and is designed for always-on local agents, and provides a guide for running local coding agents.
@nicos_ai: NVIDIA has just officially published the Skills they use for their AI agents. Right now they have Skills for: → analyzi…
NVIDIA has officially published a set of Skills for AI agents, covering video analysis, voice agents, LLM training, model acceleration, RAG, secure environments, logistics optimization, and CUDA programming.
Fastest, Largest, Strongest: NVIDIA Blackwell Sweeps MLPerf Training 6.0
NVIDIA's Blackwell platform achieved fastest training times across all MLPerf Training 6.0 benchmarks, scaling to 8,192 GPUs and showcasing up to 1.6x performance gains with the GB300 NVL72 over the GB200 NVL72.