Qwen3.6 35B A3B uncensored heretic Native MTP Preserved is Out Now With KLD 0.0015, 10/100 Refusals and the Full 19 MTPs Preserved and Retained, Available in Safetensors, GGUFs. NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats
Community release of Qwen3.6 35B A3B uncensored variant with full 19 MTP tensors preserved, available in multiple formats including Safetensors, GGUF, NVFP4 and GPTQ-Int4.
llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved: [https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved](https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved) llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-GGUF: [https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-GGUF](https://huggingface.co/llmfan46/Qwen3.6-27B-uncensored-heretic-v2-Native-MTP-Preserved-GGUF) llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-NVFP4-Experts-Only-GGUF: [https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-NVFP4-Experts-Only-GGUF](https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-NVFP4-Experts-Only-GGUF) llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-NVFP4-Experts-Only: [https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-NVFP4-Experts-Only](https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-NVFP4-Experts-Only) llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-GPTQ-Int4: [https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-GPTQ-Int4](https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic-Native-MTP-Preserved-GPTQ-Int4) People asked for it, so here it is, all realeases are confirmed to have their full MTP count\* retained and preserved. Comes with benchmark too. Find all my models here: [HuggingFace-LLMFan46](https://huggingface.co/llmfan46/models) \*All releases have been verified to retain the full MTP tensors. In safetensors format, the Qwen3.6-35B-A3B MTP tensors appear as 19 entries because \`gate\_up\_proj\` is stored as one fused tensor. In GGUF format, that fused tensor is split into separate gate/up expert tensors, so the same MTP component appears as 20 entries. The count differs by format, but the MTP tensors are preserved.
ByteShape releases Qwen 3.6 35B GGUF quantizations in NTP and MTP variants with detailed benchmarking across multiple GPUs and CPUs, finding that larger quants often outperform smaller ones and MTP provides GPU speed boosts at the cost of memory.
Unsloth has released an optimized GGUF version of the Qwen3.6-27B MTP model, achieving significantly faster inference speeds (up to 114 tok/s on an RTX 5090) compared to previous quantizations.
Release of Qwen3.6-27B-PRISM-PRO-DQ, a dynamically quantized GGUF version of Qwen3.6-27B with bias/propaganda removal, preserving native MTP draft head and vision tower, enabling lossless speculative decoding for faster inference.