@Ex0byt: Days of model activations, slicing, splicing, fine-tuning + 15 hours of nail-biting NVFP4 calibration/propagation passe…
Summary
A community member released Qwen3.6-35B-A3B-PRISM-NVFP4, a multi-pass, dataset-calibrated NVFP4 quantization of Qwen3.6-35B-A3B that its author describes as zero-loss.
Cached at: 04/23/26, 12:06 PM
Days of model activations, slicing, splicing, fine-tuning + 15 hours of nail-biting NVFP4 calibration/propagation passes. I freely give you Qwen3.6-35B-A3B-PRISM-NVFP4 - Highest-quality multi-pass, 1024 custom dataset-calibrated, zero-loss NVFP4 (support for
Similar Articles
RedHatAI/Qwen3.6-35B-A3B-NVFP4
Red Hat AI released an NVFP4-quantized 35B MoE version of Qwen3.6 that retains 96.28% GSM8K accuracy while enabling 4-bit inference via vLLM.
nvidia/Qwen3.6-35B-A3B-NVFP4 · Hugging Face
NVIDIA releases Qwen3.6-35B-A3B-NVFP4, a quantized version of Alibaba's mixture-of-experts multimodal language model, optimized for deployment on NVIDIA GPUs using Model Optimizer.
@Ex0byt: And... Ladies and Gentlemen: Qwen3.6-27B-PRISM-PRO-DQ - enjoy!
Qwen3.6-27B-PRISM-PRO-DQ is a dynamically quantized GGUF version of Qwen3.6-27B with bias/propaganda removal. It preserves the native MTP draft head and vision tower, enabling lossless speculative decoding for faster inference.
Qwen/Qwen3.6-27B-FP8
Alibaba releases Qwen3.6-27B-FP8, a 27B FP8-quantized model with strong agentic coding and reasoning benchmarks, now available on Hugging Face.
@no_stp_on_snek: After testing a replay of prompts from this model I can confidently say it's a viable replacement for one of my product…
llmfan46 released GGUF quantizations of a decensored Qwen3.6-35B-A3B, produced with the Heretic method with MPOA, which achieves 88% fewer refusals while preserving model quality.