@populartourist: Having worked consistently with Qwen3.6 27B NVFP4 on repos - it's clear that this quant is not reliable, at least for c…
Summary
The user reports that the Qwen3.6 27B NVFP4 quantization is unreliable for coding, with inconsistent quality despite high throughput, and suggests that Q4_K_M may be more consistent.
Cached at: 06/15/26, 09:09 PM
Having worked consistently with Qwen3.6 27B NVFP4 on repos - it’s clear that this quant is not reliable, at least for coding.
Quality is all over the place: results are either really good or poor.
Throughput is amazing, but the gains are lost to debugging churn.
Ornestein Q6_K has been the dark horse so far.
I’m confident Q4_K_M can be more consistent than this NVFP4 variant.
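One way to put a number on the "really good or poor" pattern described above is to repeat the same coding task several times per quant and compare the spread of the scores, not just the average. The sketch below is a minimal illustration under stated assumptions: the function name `consistency_summary`, the quant labels, and the scores are all hypothetical placeholders, not measurements from the post.

```python
from statistics import mean, pstdev


def consistency_summary(scores):
    """Summarize per-run scores (e.g. fraction of tests passed per run)."""
    if not scores:
        raise ValueError("need at least one score")
    return {
        "mean": mean(scores),      # average quality across runs
        "spread": pstdev(scores),  # population std dev: higher means less consistent
        "worst": min(scores),
        "best": max(scores),
    }


if __name__ == "__main__":
    # Placeholder values for illustration only, NOT measured data.
    runs = {
        "quant_a": [1.0, 0.2, 0.9, 0.1],  # swings between very good and poor
        "quant_b": [0.6, 0.7, 0.6, 0.7],  # lower peak, steadier results
    }
    for name, scores in runs.items():
        print(name, consistency_summary(scores))
```

A quant with a high mean but a large spread and a low worst case is the kind that tends to cost time in debugging churn, which is the trade-off the post describes.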
wd 🔺 (@populartourist): Good quality Qwen3.6 27B NVFP4 grafted MTP, with image.
Fits on an RTX 5090 with up to a 180/190k context window using FP8 KV at max_num_seqs 1; it can fit more sequences if you don't exhaust the context in parallel.
Code work sees 2-8k tok/s throughput, with peaks above 11k with MTP.
Initial quality looks
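The context-window claim in the post above comes down to KV-cache arithmetic: for standard attention, the cache holds one key and one value vector per token, per KV head, per attention layer. The sketch below shows that formula only. The helper name `kv_cache_gib` and every architecture number in the example are hypothetical placeholders, since the post does not give the model's layer count, KV head count, or head dimension. For a hybrid architecture, only the layers that keep a full attention cache should be counted.

```python
def kv_cache_gib(tokens, layers, kv_heads, head_dim, bytes_per_elem=1):
    """Estimate KV-cache size in GiB for standard (full) attention.

    bytes_per_elem: 1 for an FP8 cache, 2 for FP16/BF16.
    """
    # Factor of 2 accounts for storing both keys and values.
    total_bytes = 2 * layers * kv_heads * head_dim * tokens * bytes_per_elem
    return total_bytes / 2**30


if __name__ == "__main__":
    # All architecture values below are ASSUMED placeholders, not the real model config.
    assumed = dict(layers=16, kv_heads=4, head_dim=256)
    for dtype_name, width in (("fp8", 1), ("fp16", 2)):
        size = kv_cache_gib(tokens=180_000, bytes_per_elem=width, **assumed)
        print(f"{dtype_name}: {size:.2f} GiB for one 180k-token sequence")
```

Because the size scales linearly with both token count and element width, an FP8 cache holds twice the context of an FP16 cache in the same memory, and running several sequences that each use only part of the window costs the same as one sequence using the sum of their tokens.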