Gemma 4 26B-A4B GGUF Benchmarks

Reddit r/LocalLLaMA 04/20/26, 02:50 PM Models

quantization gguf local-llm benchmarks gemma qwen open-source

Summary

Unsloth has released KL Divergence benchmarks for Gemma 4 26B-A4B GGUF quantizations, showing Unsloth GGUFs top 21 of 22 sizes on the Pareto frontier. They also introduced a new UD-IQ4_NL_XL quant fitting in 16GB VRAM and updated Q6_K and MLX quants for both Gemma 4 and Qwen3.6.

Hey r/LocalLLaMA, we ran KL Divergence (KLD) benchmarks for Gemma 4 26B-A4B GGUFs across providers to help you pick the best quant.

* Mean KL Divergence puts nearly all **Unsloth GGUFs on the Pareto frontier**.
* KLD shows how well a quantized model matches the original BF16 output distribution, indicating retained accuracy.
* This makes Unsloth the **top performer in 21 of 22 sizes.** The trend is similar for 99.9% KLD and others.
* We also updated our Q6\_K quants to be more dynamic. The previous quants were already optimized and perfectly fine; the new ones are a bit better but slightly bigger. There is no need to re-download, so it's up to you whether you want the slightly better version. The same was done for Qwen3.6.
* We're also introducing a new UD-IQ4\_NL\_XL quant that fits in 16GB VRAM. UD-IQ4\_NL\_XL (14.6GB) sits between UD-IQ4\_XS (13.4GB) and UD-Q4\_K\_S (16.4GB). The same was done for Qwen3.6.

For HQ versions of the graphs (Reddit mobile compresses them), see: [Gemma 4 Benchmarks](https://unsloth.ai/docs/models/gemma-4#unsloth-gguf-benchmarks) and [Qwen3.6 Benchmarks](https://unsloth.ai/docs/models/qwen3.6#unsloth-gguf-benchmarks)

We also updated our MLX quants to be more dynamic, with better layer selection (there are limitations due to MLX): [See here](https://unsloth.ai/docs/models/qwen3.6#mlx-dynamic-quants)

|MLX Metrics|**UD-4bit (Old)**|**UD-4bit (New)**|**MLX 4.4bit MSQ**|
|:-|:-|:-|:-|
|Perplexity|4.772|**4.766**|4.864|
|Mean KLD|0.0177|**0.0163**|0.0878|
|99.9% KLD|0.8901|**0.8398**|2.9597|
|Disk Size|21.4 GB|21.6 GB|21.2 GB|

Gemma 4 GGUFs: [https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF](https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF)

Qwen3.6 GGUFs: [https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF](https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF)
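For readers unfamiliar with the metrics named in the post, here is a minimal sketch of how mean KLD, a high-percentile KLD, and a size-versus-KLD Pareto frontier could be computed. It is not Unsloth's benchmark code. It assumes the common convention KL(P_ref || P_quant) per token position, with P_ref taken from the BF16 model's logits, and a nearest-rank percentile. The function names and the quant entries in the usage note are hypothetical.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one token's vocabulary logits.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def token_kld(ref_logits, quant_logits):
    # KL(P_ref || P_quant) in nats for a single token position (assumed convention).
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def kld_summary(ref_rows, quant_rows, pct=99.9):
    # Returns (mean KLD, pct-th percentile KLD) across all token positions.
    klds = sorted(token_kld(r, q) for r, q in zip(ref_rows, quant_rows))
    mean = sum(klds) / len(klds)
    rank = max(1, math.ceil(pct / 100 * len(klds)))  # nearest-rank percentile
    return mean, klds[rank - 1]

def pareto_frontier(quants):
    # quants: list of (name, size_gb, mean_kld); lower is better on both axes.
    # A quant is kept unless another is no worse on both axes and strictly
    # better on at least one.
    frontier = []
    for name, size, kld in quants:
        dominated = any(
            s <= size and k <= kld and (s < size or k < kld)
            for _, s, k in quants
        )
        if not dominated:
            frontier.append(name)
    return frontier
```

As a usage example with made-up placeholder entries, `pareto_frontier([("a", 10.0, 0.05), ("b", 12.0, 0.06), ("c", 14.0, 0.02)])` drops `"b"`, because `"a"` is both smaller and closer to the reference distribution.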


Similar Articles

unsloth/gemma-4-12B-it-qat-GGUF

unsloth/gemma-4-26B-A4B-it-GGUF

Layman's comparison on Qwen3.6 35b-a3b and Gemma4 26b-a4b-it

gemma-4-12b-it vs Qwen3.5-9B on shared benchmarks: Qwen is overall winner beating gemma in 5/8 benchmarks despite a smaller footprint

Personal Eval follow-up: Gemma4 26B MoE (Q8) vs Qwen3.5 27B Dense vs Gemma4 31B Dense Compared
