Tag
The user reports that the Qwen3.6 27B NVFP4 quantization is unreliable for coding, with inconsistent quality despite high throughput, and suggests that Q4_K_M may be more consistent.
Unsloth has released an optimized GGUF version of the Qwen3.6-27B MTP model, achieving significantly faster inference speeds (up to 114 tok/s on an RTX 5090) compared to previous quantizations.