Qwen3.6-35B-A3B-Abliterated-Heretic-MLX-4bit

Reddit r/LocalLLaMA

Summary

The user reviews an abliterated (refusal-removed), 4-bit MLX quantization of Qwen3.6-35B-A3B optimized for Apple Silicon, praising its speed, intelligence, and lack of safety disclaimers.

This model is the GOAT of general chatbot models. Whip-smart, lightning fast (Apple silicon), and tells the truth with no disclaimers. If it only gets better from here, I am absolutely gobsmacked. Gobsmacked.
Original Article
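
For readers who want to try a quant like this, below is a minimal sketch of loading and prompting it with the mlx-lm Python package on Apple Silicon. The Hugging Face repo ID is an assumption based on the post title, not a verified path.

```python
# Minimal sketch: running a 4-bit MLX quant with mlx-lm on Apple Silicon.
# The repo ID below is assumed from the post title, not a verified path.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3.6-35B-A3B-Abliterated-Heretic-MLX-4bit")

# Apply the chat template so the instruct model sees a proper conversation.
messages = [{"role": "user", "content": "Explain mixture-of-experts routing in two sentences."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# verbose=True also prints generation speed (tok/s) alongside the output.
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(text)
```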

Similar Articles

Qwen/Qwen3.6-35B-A3B-FP8

Hugging Face Models Trending

Alibaba releases Qwen3.6-35B-A3B-FP8, an open-weight FP8-quantized variant of Qwen3.6 with 35B total parameters and 3B activated per token via MoE, featuring improved agentic coding capabilities and preserved thinking traces for iterative development.
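
As a rough illustration of how an FP8 checkpoint like this is typically consumed, here is a minimal vLLM sketch. It assumes an FP8-capable GPU and that the repo ID matches the release; vLLM reads the quantization config embedded in the checkpoint, so no extra flags are sketched here.

```python
# Minimal sketch: serving an FP8-quantized checkpoint with vLLM.
# Assumes an FP8-capable GPU (e.g. Hopper/Ada) and that the repo ID
# matches the release named in the post.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen3.6-35B-A3B-FP8")
params = SamplingParams(temperature=0.7, max_tokens=512)

outputs = llm.generate(["Write a Python function that merges two sorted lists."], params)
print(outputs[0].outputs[0].text)
```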

Qwen 3.6 35B A3B vs Qwen 3.5 122B A10B

Reddit r/LocalLLaMA

A user reports that Qwen 3.5 122B-A10B significantly outperforms Qwen 3.6 35B-A3B on multi-step tasks despite benchmark claims to the contrary, and asks whether quantization or setup issues are to blame.

I benchmarked 21 local LLMs on a MacBook Air M5 for code quality AND speed

Reddit r/LocalLLaMA

A developer benchmarked 21 local LLMs on a MacBook Air M5 using HumanEval+ and found Qwen 3.6 35B-A3B (MoE) leading at 89.6% with 16.9 tok/s, while Qwen 2.5 Coder 7B offers the best RAM-to-performance ratio at 84.2% in 4.5 GB of RAM. Notably, Gemma 4 models significantly underperformed expectations (31.1% for the 31B), possibly due to Q4_K_M quantization effects.
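
For context on how tok/s figures like these are usually obtained, below is a minimal sketch that times a single generation with llama-cpp-python and divides emitted tokens by wall-clock time. The GGUF filename is a placeholder, not the poster's actual setup.

```python
# Minimal sketch of a tok/s measurement behind numbers like "16.9 tok/s":
# time one fixed generation and divide emitted tokens by elapsed seconds.
# The GGUF path is a placeholder, not the benchmark author's actual file.
import time
from llama_cpp import Llama

llm = Llama(model_path="qwen3.6-35b-a3b-q4_k_m.gguf", n_ctx=4096, verbose=False)

start = time.perf_counter()
out = llm("Write a function that checks if a string is a palindrome.",
          max_tokens=256)
elapsed = time.perf_counter() - start

# The completion dict reports token counts in its OpenAI-style usage field.
n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tok/s")
```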