@spiritbuun: My first ever quant is going to drop in the next week. I've been working on this for over a month now. The recipe is lo…
Summary
The author announces that their first quantized model, a B27 quant produced with quantization-aware training (QAT), will be released within the next week after more than a month of work, and describes it as the smartest B27 yet.
Cached Full Text
Cached at: 06/02/26, 11:39 PM
My first ever quant is going to drop in the next week. I’ve been working on this for over a month now. The recipe is locked in. The QAT is finished. This is not just quantization. This is the smartest B27 ever. https://t.co/4t4a4GYHTS
Similar Articles
Need a second pair of eyes, this Qwen3.6 27B quant recipe consistently thinks less and is correct
The author shares a quantization recipe for Qwen3.6 27B that makes the model use significantly fewer thinking tokens while still producing correct answers, leading to faster inference on math benchmarks.
@Ex0byt: Days of model activations, slicing, splicing, fine-tuning + 15 hours of nail-biting NVFP4 calibration/propagation passe…
A community member released Qwen3.6-35B-A3B-PRISM-NVFP4, a multi-pass, dataset-calibrated NVFP4-quantized variant of the Qwen model that is described as zero-loss.
@coffeecup2020: TurboQuant - Qwopus3.6-27B-v2-TQ3_4S.gguf Confirmed with gpqa test this is something great. https://huggingface.co/YTan…
A TurboQuant GGUF quantization (TQ3_4S) of the Qwopus3.6-27B-v2 model, which the poster reports performs well on a GPQA test, shared on Hugging Face with credits to Jackrong and KyleHessling.
@norpadon: We are releasing our first quantized checkpoints for the Qwen3.5 series of models, co-designed jointly with our inferen…
Mirai Labs releases quantized checkpoints for Qwen3.5 models using their new Mirai quantization method, achieving 40-60% faster inference on Apple hardware while maintaining accuracy comparable to full-precision models.
@anirudhbv_ce: Introducing SpectralQuant.. here to save your KV cache :)
SpectralQuant is a new KV cache quantization technique achieving 5.95× compression on Mistral 7B with only 7.5% perplexity overhead, significantly outperforming TurboQuant while requiring only 15 seconds of calibration per model.