Qwen3.6-35B-A3B-Plus-Uncensored-Wasserstein (neuron-level surgery)

Reddit r/LocalLLaMA Models

Summary

A community member repaired dead (all-zero) neurons in the Qwen3.6-35B-A3B MoE by copying weights from healthy neighbors, releasing fixed GGUF and FP8 safetensors versions.

Hello everyone. During a data-debugging session at the per-tensor and per-neuron level, I found that neurons in the tensor layers of a MoE model can die (hold all-zero values). [Here is the log.](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Plus-Uncensored-Wasserstein-Safetensors/raw/main/Qwen3.6-35B-A3B-Plus-Uncensored-fp8_e4m3fn.txt) For example, in `blk.0.ffn_gate_exps.weight` and `blk.0.ffn_up_exps.weight` of the Qwen3.6 35B A3B Q8_0 quant, I found **40% zero neurons.** In Qwen3.5 9B I didn't find any zero blocks; every block there holds values.

I don't know why this happens. I have never trained LLMs myself, but the problem exists. A company I'm interviewing with independently confirmed these findings using different detection methods. I suspect this is the main reason why LLMs degrade during training.

I fixed the model as much as I could at the binary level on a Google Colab free-tier CPU, restoring the dead neurons (7.5 million zero blocks in the Q8 quant) by copying binary weight data from healthy neighbour neurons into the dead ones, plus linear interpolation (a sketch of the scan and of the repair appears below).

Here is the fixed GGUF model: [https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Plus-Uncensored-Wasserstein-GGUF](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Plus-Uncensored-Wasserstein-GGUF)

And the fp8_e4m3fn .safetensors version: [https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Plus-Uncensored-Wasserstein-Safetensors](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Plus-Uncensored-Wasserstein-Safetensors)

I converted Q8_0 to .safetensors with this script: [https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Plus-Uncensored-Wasserstein-Safetensors/raw/main/gguf_to_safetensors.py](https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Plus-Uncensored-Wasserstein-Safetensors/raw/main/gguf_to_safetensors.py)

The uncensored FP8 version in .safetensors is trainable: its gradients are alive, with no zeros.

The model is based on this one: [https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive](https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive). Thanks to [HauhauCS](https://huggingface.co/HauhauCS) for the amazing work.

System prompt: [https://pastebin.com/pU25DVnB](https://pastebin.com/pU25DVnB)

Chat template: [https://pastebin.com/Dy2fmmpN](https://pastebin.com/Dy2fmmpN)

Recommended quants: `MXFP4_MOE` and `Q8_0`

**Recommended Settings (LM Studio):**

|Parameter|Value|
|:-|:-|
|Temperature|0.7|
|Top K Sampling|20|
|Presence Penalty|1.5|
|Repeat Penalty|Disabled|
|Top P Sampling|0.8|
|Min P Sampling|0|
|Seed|42|

Enjoy ^_^

PS: The Qwen team has released a 3.6 27B version. I can't run it on my RTX 3060 12GB, but I will heal it for the community and release it after HauhauCS's 27B uncensored release.
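The post doesn't include its detection code, but the scan it describes is simple to reproduce. Below is a minimal sketch that counts all-zero rows ("dead neurons") in the expert FFN tensors of the converted .safetensors checkpoint; the file path is a placeholder, and the GGUF-style tensor names are assumed to survive the conversion.

```python
# Minimal sketch: count all-zero rows ("dead neurons") per expert FFN tensor.
# Assumptions: a local .safetensors file (path is a placeholder) whose tensors
# keep GGUF-style names like blk.0.ffn_gate_exps.weight, and the post's
# definition of "dead" as an exactly-zero row.
import torch
from safetensors.torch import load_file

state = load_file("Qwen3.6-35B-A3B-Plus-Uncensored-fp8_e4m3fn.safetensors")  # placeholder

for name, tensor in state.items():
    if "ffn_gate_exps" not in name and "ffn_up_exps" not in name:
        continue
    w = tensor.float().flatten(0, -2)   # (experts, rows, cols) -> (experts*rows, cols)
    dead = w.abs().sum(dim=-1) == 0     # a row of exact zeros is a dead neuron
    if dead.any():
        print(f"{name}: {int(dead.sum())}/{w.shape[0]} dead rows "
              f"({100.0 * dead.float().mean().item():.1f}%)")
```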
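The repair can be sketched the same way. The post describes copying binary weight data from healthy neighbour neurons plus linear interpolation; one plausible reading of that, shown here on float tensors rather than raw Q8_0 blocks, is to interpolate each dead row between the nearest healthy rows above and below it, falling back to a plain copy at the edges. This is an interpretation of the post's one-line description, not the author's actual script.

```python
import torch

def heal_dead_rows(w: torch.Tensor) -> torch.Tensor:
    """Fill all-zero rows of a (rows, cols) matrix from healthy neighbours.

    A guess at the post's "copy from healthy neighbours + linear
    interpolation" repair, restated on floats instead of Q8_0 blocks.
    """
    w = w.clone()
    healthy = (w.abs().sum(dim=-1) != 0).nonzero(as_tuple=True)[0]
    if healthy.numel() == 0:
        return w  # nothing healthy to copy from
    for i in range(w.shape[0]):
        if w[i].abs().sum() != 0:
            continue  # row is alive
        below = healthy[healthy < i]
        above = healthy[healthy > i]
        if below.numel() and above.numel():
            lo, hi = int(below[-1]), int(above[0])
            t = (i - lo) / (hi - lo)             # position between the neighbours
            w[i] = (1 - t) * w[lo] + t * w[hi]   # linear interpolation
        else:
            # dead row at an edge: only one side has neighbours, so copy
            src = int(below[-1]) if below.numel() else int(above[0])
            w[i] = w[src]
    return w
```

For the 3-D expert tensors this would run per expert slice, e.g. `torch.stack([heal_dead_rows(e) for e in tensor.float()])`.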
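For the GGUF-to-safetensors step the post links its actual script; a rough sketch of the same idea, using the `gguf` Python package that ships with llama.cpp, might look like the following. The `GGUFReader`/`dequantize` usage and the reversed dimension order are assumptions based on that package, and the paths are placeholders.

```python
# Rough sketch of a Q8_0 GGUF -> float32 .safetensors conversion, in the
# spirit of the script linked in the post (whose actual contents differ).
import numpy as np
import torch
from gguf import GGUFReader
from gguf.quants import dequantize  # assumption: recent gguf-py API
from safetensors.torch import save_file

reader = GGUFReader("model-Q8_0.gguf")  # placeholder path
state = {}
for t in reader.tensors:
    data = dequantize(t.data, t.tensor_type)        # quantized blocks -> float32
    shape = tuple(int(d) for d in t.shape)[::-1]    # GGUF stores dims reversed
    state[t.name] = torch.from_numpy(np.ascontiguousarray(data.reshape(shape)))
save_file(state, "model-f32.safetensors")  # placeholder path
```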
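The recommended settings map onto LM Studio's OpenAI-compatible local server, which listens on port 1234 by default. A hedged example request follows; the model identifier is a placeholder, and `top_k`/`min_p` are non-standard fields, so if the server does not accept them they should be set in LM Studio's sampling UI instead (along with disabling the repeat penalty).

```python
# Example request with the post's recommended sampling settings, sent to
# LM Studio's OpenAI-compatible server (default port 1234).
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "qwen3.6-35b-a3b-plus-uncensored-wasserstein",  # placeholder id
        "messages": [{"role": "user", "content": "Hello!"}],
        "temperature": 0.7,
        "top_p": 0.8,
        "presence_penalty": 1.5,
        "seed": 42,
        "top_k": 20,  # assumption: accepted as an extra, non-OpenAI field
        "min_p": 0,   # assumption: accepted as an extra, non-OpenAI field
    },
    timeout=300,
)
print(resp.json()["choices"][0]["message"]["content"])
```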

Similar Articles

Qwen3.6-27B-GGUF is here!

Reddit r/LocalLLaMA

Community GGUF release of Qwen’s 27B hybrid-architecture model with 262k context, multimodal inputs, tool calling and "Thinking Preservation" for agentic coding.

Qwen/Qwen3.6-35B-A3B-FP8

Hugging Face Models Trending

Alibaba releases Qwen3.6-35B-A3B-FP8, an open-weight quantized variant of Qwen3.6 with 35B parameters and 3B activated via MoE, featuring improved agentic coding capabilities and thinking preservation for iterative development.

Jackrong/Qwopus-GLM-18B-Merged-GGUF

Hugging Face Models Trending

Jackrong released Qwopus-GLM-18B-Merged-GGUF, a 64-layer frankenmerge combining two Qwen3.5-9B finetunes into an ~18B-parameter model, healed with 1000-step LoRA fine-tuning to fix layer-boundary issues. The model achieves 90.9% on capability benchmarks while using less than half the VRAM of the Qwen3.6-35B MoE.