Layman's comparison on Qwen3.6 35b-a3b and Gemma4 26b-a4b-it

Reddit r/LocalLLaMA News

Summary

A user compares Qwen3.6 35B-A3B and Gemma 4 26B-A4B-IT running locally on a 16GB VRAM GPU via LM Studio, finding Qwen3.6 produces more detailed outputs while both run at comparable speeds. The post is an informal community comparison using quantized models.

Gemma 4 26b-a4b-it is basically a solid B student that gets the job done. Qwen3.6-35b-a3b is an A+ student that has plenty of energy left after finishing the assignment to add some flair.

Both models run at comparable speeds on my 16GB VRAM video card, on Windows in LM Studio with the recommended inference settings.

Models used: unsloth/gemma-4-26B-A4B-it-UD-Q4\_K\_S and AesSedai/Qwen3.6-35B-A3B IQ4\_XS.

Any strong disagreements?

**Edit:** Apparently I've been using Gemma 4 wrong. [Sadman782's comment](https://www.reddit.com/r/LocalLLaMA/comments/1sqxiz0/comment/ohb09kp/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button) and his system prompt really help unlock some of Gemma 4's potential!
