DiffusionGemma

Simon Willison's Blog Models

Summary

Google released DiffusionGemma, an open-weight text generation model (26B parameters, 4B active) under the Apache 2 license. In the author's test through NVIDIA's NIM cloud API, it returned 2,409 tokens in 4.4 seconds, which works out to at least 500 tokens/second.


Cached at: 06/10/26, 09:45 PM

# DiffusionGemma

Source: [https://simonwillison.net/2026/Jun/10/diffusiongemma/](https://simonwillison.net/2026/Jun/10/diffusiongemma/)

10th June 2026 - Link Blog

**[DiffusionGemma](https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation/)** ([via](https://news.ycombinator.com/item?id=48478471)) Last May Google briefly released an experimental Gemini Diffusion model. I [tried the preview at the time](https://simonwillison.net/2025/May/21/gemini-diffusion/) and recorded it running at 857 tokens/second. It was an exciting model, but Google made no further announcements about it.

That research has returned in the best possible way: as a new open weight (Apache 2 licensed) Gemma model, [google/diffusiongemma-26B-A4B-it](https://huggingface.co/google/diffusiongemma-26B-A4B-it).

NVIDIA are currently [hosting the model for free](https://build.nvidia.com/google/diffusiongemma-26b-a4b-it) on their NIM cloud API. I used that API to [generate this pelican](https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2Fe5e234a6dc6eef61e209ce1629620042), which took 4.4s (according to `time uv run generate.py`) to return 2,409 tokens - so at least 500 tokens/second.
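The "at least 500 tokens/second" figure is the token count divided by the wall-clock time. Here is a minimal sketch of that arithmetic, using only the two numbers quoted above. The helper name `tokens_per_second` is hypothetical and is not taken from `generate.py`, which is not shown in the post. Because `time` measures the whole script run, including process startup and the network round trip, the result is best read as a lower bound on generation speed.

```python
def tokens_per_second(tokens: int, elapsed_seconds: float) -> float:
    """Average throughput over a wall-clock interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return tokens / elapsed_seconds


# Figures quoted in the post: 2,409 tokens returned in 4.4 seconds.
rate = tokens_per_second(2409, 4.4)
print(f"{rate:.1f} tokens/second")  # 547.5 tokens/second
```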
![Flat minimalist illustration of a white pelican with a large orange beak riding a red bicycle with black wheels, against a pale blue background with a green line representing the ground](https://static.simonwillison.net/static/2026/diffusiongemma-pelican.png)

Posted [10th June 2026](https://simonwillison.net/2026/Jun/10/) at 8 pm

This is a **link post** by Simon Willison, posted on [10th June 2026](https://simonwillison.net/2026/Jun/10/).
