@mervenoyann: DiffusionGemma is out it's compute-bound so 4x faster compared to other Gemma-4 models (1k tok/s on H100) also great on…

X AI KOLs Following 06/10/26, 04:55 PM Models

diffusion-gemma gemma-4 google compute-bound coding 3d-generation front-end

Summary

DiffusionGemma has been released. Because it is compute-bound, it runs about 4x faster than other Gemma-4 models (1k tok/s on an H100). It is also strong at coding: it can generate and iterate on code ranging from 3D generation to front-end.

DiffusionGemma is out 🔥 it's compute-bound so 4x faster compared to other Gemma-4 models (1k tok/s on H100) 💨 also great on coding, generate and iterate on any code from 3D generation to front-end ⤵️ https://t.co/NAjEaml6dV

Similar Articles

DiffusionGemma under real workloads feels very different from benchmark demos

Reddit r/LocalLLaMA

Internal testing of DiffusionGemma reveals significant performance differences between H100 and A100 GPUs under real-world workloads, with H100s scaling much better under concurrency, and efficiency varying greatly depending on workload type, raising questions about benchmark reliability.

@_philschmid: Gemma goes diffusion! DiffusionGemma with up to 1000+ tokens per second! - Built on Gemma 4 as a 26B MoE model. - 3.8B …

X AI KOLs Following

DiffusionGemma, a 26B MoE model based on Gemma 4, achieves over 1000 tokens per second using diffusion for text generation in 256-token blocks, fitting in 18GB VRAM with quantization, released under Apache 2.0.

DifussionGemma 4 on 4x7900xtx

DiffusionGemma 26b on a 4090 at up to 475t/s... and some thoughts...

DiffusionGemma: 4x Faster Text Generation