Qwen 3.6 35B-A3B @ Q4 or Gemma 4 12B @ Q8?

Reddit r/LocalLLaMA 06/14/26, 09:30 PM News

quantization local-llm inference qwen gemma model-comparison

Summary

User asks for advice on choosing between quantized Qwen 3.6 35B-A3B at Q4 and Gemma 4 12B at Q8 for local codebase work on a 32GB unified memory setup.

Wondering how much model quantization matters here. My daily driver on a 32 GB unified memory setup is the Qwen model, which outputs ~15 tokens a second. I've heard good things about the 12B Gemma 4 model, so I'm interested in trying it against my codebase. Given its size, I can very comfortably fit the Q8 in. Hell, I could probably run it at BF16 lol
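To put rough numbers on the "fits comfortably" claim, here is a minimal back-of-the-envelope sketch in Python. It only estimates weight storage as parameters times bits per weight. The bits-per-weight figures are assumptions: the post does not say which Q4 or Q8 variant is in use, so the sketch borrows 4.5 and 8.5 bits from llama.cpp's Q4_0 and Q8_0 block layouts (other variants such as K-quants differ), and it ignores KV cache, context length, and runtime overhead entirely. The function name `weight_gib` is made up for illustration.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB.

    Ignores KV cache, activations, and runtime overhead, so real
    memory use will be higher than this figure.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# (parameters in billions, assumed bits per weight)
# 4.5 and 8.5 are assumptions based on Q4_0 / Q8_0 block layouts;
# the post does not name the exact quant type.
candidates = {
    "Qwen 3.6 35B-A3B @ Q4": (35, 4.5),
    "Gemma 4 12B @ Q8": (12, 8.5),
    "Gemma 4 12B @ BF16": (12, 16),
}

for name, (params_b, bits) in candidates.items():
    print(f"{name}: ~{weight_gib(params_b, bits):.1f} GiB of weights")
```

Under these assumptions the 12B weights come to roughly 12 GiB at Q8 and roughly 22 GiB at BF16, against roughly 18 GiB for the 35B model at Q4. So BF16 probably does load on 32 GB, but with noticeably less headroom for context and the OS than Q8 leaves.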


Similar Articles

Layman's comparison on Qwen3.6 35b-a3b and Gemma4 26b-a4b-it

Reddit r/LocalLLaMA

A user compares Qwen3.6 35B-A3B and Gemma 4 26B-A4B-IT running locally on a 16GB VRAM GPU via LM Studio, finding Qwen3.6 produces more detailed outputs while both run at comparable speeds. The post is an informal community comparison using quantized models.

Gemma 4 beats Qwen 3.5 (UPDATE), and Qwen 3.6 27B + MiniMax M2.7 is the best OpenCode setup

Reddit r/LocalLLaMA

Personal benchmark shows Gemma-4E4B tops for routing, Qwen-3.6 27/30B beats Gemma-4 for coding, and MiniMax M2.7 MXFP4 replaces giant Qwen-3.5 quants in an OpenCode llama-swap workflow.

gemma-4-12b-it vs Qwen3.5-9B on shared benchmarks: Qwen is overall winner beating gemma in 5/8 benchmarks despite a smaller footprint

Reddit r/LocalLLaMA

Qwen3.5-9B outperforms gemma-4-12b-it on 5 of 8 benchmarks despite having a smaller footprint, with gemma only slightly better at coding.

Qwen 3.6 27B kick balls

Reddit r/LocalLLaMA

A user shares their positive experience using Qwen 3.6 27B locally for complex research and coding, finding it outperforms Gemini Pro in career advice and immigration research, while also noting performance issues with Gemma 4 31B.

Anyone use QwQ-32B? It's over a year old? Has Qwen 3.6 27b basically replaced it?

Reddit r/LocalLLaMA

A discussion on whether the older QwQ-32B model is still useful compared to newer alternatives like Qwen 3.6 27b and Gemma 4, particularly for coding tasks.

