Anybody else noticing how good gemma-4-26b-a4b is with one-shotting three.js?

Reddit r/LocalLLaMA News

Summary

A discussion highlighting the capability of the Gemma-4-26b-a4b model to generate Three.js code for generative art demos using one-shot prompting.

I wrote up this little Python app to cycle through a bunch of prompts like this:

|Single HTML file using three.js from CDN. A central rotating MeshNormalMaterial torus knot. Place a bright Sprite (AdditiveBlending, soft circular canvas texture) at a position projected to screen, and 6 smaller sprites along the line from that position to screen center, each with different sizes/tints. Update positions each frame.|
|:-|

I have a .csv file in there with 80 or so of these little prompts to cycle through. It writes the code into a mock terminal window, detects a crash if needed, and then shows and archives the finished HTML file. Really fun to mess around with.

Link above is to a static demo. The GitHub page is here: [https://github.com/RowanUnderwood/auto\_demo\_scener](https://github.com/RowanUnderwood/auto_demo_scener)

No cherry-picking here, so a few dead ones may have slipped into the archive :D
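For anyone who wants to try the same kind of loop, here is a minimal sketch of the workflow described above: read prompts from a CSV, ask a model for a single HTML file, and archive whatever comes back. This is not the code from the linked repo. The function names, the one-prompt-per-row CSV layout, and the `demo_NNN.html` naming are all assumptions made for illustration, and `generate` is a placeholder for whatever client you use to call your local model. The "skip replies with no complete HTML document" check is only a crude stand-in; detecting a crash in the rendered page would need a browser and is not shown.

```python
import csv
import io
import re
from pathlib import Path
from typing import Callable, List, Optional


def load_prompts(csv_text: str) -> List[str]:
    """Read one prompt per row from the first CSV column, skipping blank rows.

    Assumed layout: prompts that contain commas are quoted in the CSV.
    """
    rows = csv.reader(io.StringIO(csv_text))
    return [row[0].strip() for row in rows if row and row[0].strip()]


def extract_html(response: str) -> Optional[str]:
    """Return the first complete HTML document in a model reply, or None."""
    match = re.search(
        r"(?:<!doctype html>\s*)?<html\b.*?</html>",
        response,
        re.DOTALL | re.IGNORECASE,
    )
    return match.group(0) if match else None


def run_batch(
    prompts: List[str],
    generate: Callable[[str], str],  # hypothetical: prompt in, raw model reply out
    out_dir: Path,
) -> List[Path]:
    """Generate one page per prompt and archive the usable ones."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, prompt in enumerate(prompts, start=1):
        html = extract_html(generate(prompt))
        if html is None:
            # No complete HTML document in the reply: treat it as a dead run.
            continue
        path = out_dir / f"demo_{index:03d}.html"
        path.write_text(html, encoding="utf-8")
        saved.append(path)
    return saved
```

To use it, pass any function that takes a prompt string and returns the model's reply as `generate`, then open the files in the output directory in a browser.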

Cached at: 05/10/26, 06:20 PM

# AI Demoscener — Archive

Source: [https://rowanunderwood.github.io/auto_demo_scener/](https://rowanunderwood.github.io/auto_demo_scener/)
