diffusion-gemma

#diffusion-gemma

@vllm_project: Congrats to @GoogleDeepMind on DiffusionGemma A 26B diffusion language model on the Gemma4 backbone, and the first dLLM…

X AI KOLs Timeline ↗ · 8h ago Cached

vLLM announces native support for Google DeepMind's DiffusionGemma, a 26B discrete diffusion language model that generates 256-token blocks in parallel, enabling low-latency inference at 1200+ tok/s on a single H200.

@mervenoyann: DiffusionGemma is out it's compute-bound so 4x faster compared to other Gemma-4 models (1k tok/s on H100) also great on…

X AI KOLs Following ↗ · 8h ago Cached

DiffusionGemma is out. Because it is compute-bound, it runs about 4x faster than other Gemma-4 models (1k tok/s on an H100), and it does well on coding tasks, including 3D generation and front-end work.

DiffusionGemma: 4x Faster Text Generation

Hacker News Top ↗ · 9h ago Cached

Google introduces DiffusionGemma, an experimental 26B MoE open model that achieves up to 4x faster text generation on GPUs using text diffusion, targeting speed-critical interactive local workflows.
