Why MoE below A10B feels like I'm gambling

Reddit r/LocalLLaMA 04/22/26, 04:31 AM Models

Summary

Developer reports that small-active-parameter MoE models like Qwen3.6-35B-A3B exhibit lower coherence and require more guidance than the dense Qwen3.5-27B, making them hard to slot into agentic workflows.

We've seen lots of MoEs coming out recently. While they do phenomenal work at speed, you pay the price in coherence, unless the MoE has at least 10B active parameters per token. I code with these models a lot and have tried many of them. The most recent ones are **Qwen3-Coder-Next, Qwen3.5-35B, Qwen3.6-35B**, and none of them come close to the level of stability I saw in Qwen3.5-27B, not even Qwen3.6-35B-A3B??

While the A3B MoE can solve the problem, it often needs hand-holding and multi-turn steering. The A3B often tries to use tools available in the coding harness that don't apply to the problem it's trying to fix, so I often have to manually disable some tools to keep it focused, while the 27B would intuitively and successfully ignore the irrelevant tools, etc. This is just one example, but what the model will choose to do next varies far more with the 35B-A3B than with the 27B dense.

I would like to use the MoE, but I'm struggling to find a use case for where I would put it in my agentic workflow.

Edit: English is hard, but you get what I'm saying? At least I'll leave the typos as proof this isn't a bot account. LOL
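The manual tool disabling described above could be scripted rather than done by hand each time. What follows is only a minimal sketch, assuming the coding harness sends the model a list of OpenAI-style function tool definitions with each request. The tool names, the `ALL_TOOLS` list, and the `filter_tools` helper are all hypothetical and not taken from any particular harness.

```python
from typing import Any, Iterable

# Hypothetical tool definitions in the common OpenAI-style "function" shape.
# A real harness would define its own names, descriptions, and parameters.
ALL_TOOLS: list[dict[str, Any]] = [
    {"type": "function", "function": {"name": "read_file", "parameters": {"type": "object", "properties": {}}}},
    {"type": "function", "function": {"name": "apply_patch", "parameters": {"type": "object", "properties": {}}}},
    {"type": "function", "function": {"name": "web_search", "parameters": {"type": "object", "properties": {}}}},
    {"type": "function", "function": {"name": "run_browser", "parameters": {"type": "object", "properties": {}}}},
]


def filter_tools(tools: list[dict[str, Any]], allowed: Iterable[str]) -> list[dict[str, Any]]:
    """Return only the tools whose function name is on the allowlist.

    The original order of `tools` is preserved, and unknown names in
    `allowed` are silently ignored.
    """
    allowed_names = set(allowed)
    return [tool for tool in tools if tool["function"]["name"] in allowed_names]


# Example: for a local bug fix, expose only the file tools to the model.
bugfix_tools = filter_tools(ALL_TOOLS, ["read_file", "apply_patch"])
bugfix_tool_names = [tool["function"]["name"] for tool in bugfix_tools]
```

The filtered list would then be passed as the request's tool list in place of the full set, so a small-active MoE never sees the tools that tend to distract it, while a dense model could still be given everything.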

Similar Articles

Qwen3.5-27B, Qwen3.5-122B, and Qwen3.6-35B on 4x RTX 3090 — MoEs struggle with strict global rules

Reddit r/LocalLLaMA

A user benchmarks three Qwen models (Qwen3.5-27B dense, Qwen3.5-122B-A10B MoE, Qwen3.6-35B-A3B MoE) on 4x RTX 3090 GPUs under real agentic workloads, finding that MoE models consistently underperform the dense 27B at following strict global rules despite speed advantages, with the Qwen3.6-35B leading in generation throughput.

Qwen-AgentWorld-35B-A3B: a 3B-active MoE trained to simulate MCP, terminal, SWE, Android, web and OS environments

Reddit r/LocalLLaMA

Qwen released Qwen-AgentWorld-35B-A3B, a 35B-parameter MoE model with 3B active parameters, designed as a language world model to simulate environment responses for agent interactions across seven domains including MCP, terminal, SWE, Android, web, and OS.

Forgive my ignorance but how is a 27B model better than 397B?

Reddit r/LocalLLaMA

User questions how Qwen's 27B dense model can outperform its 397B MoE variant, sparking discussion on MoE efficiency versus dense model quality.

Qwen/Qwen3.6-35B-A3B

Hugging Face Models Trending

Qwen releases Qwen3.6-35B-A3B, an open-weight Mixture-of-Experts model with 35B total parameters and 3B active parameters, featuring significant improvements in agentic coding and reasoning preservation.

@noctus91: I recently switched from Qwen 3.5 9B to LFM2.5-8B-A1B by @liquidai, and it's quickly become my default local model in H…

X AI KOLs Timeline

A user shares their positive experience switching from Qwen 3.5 9B to Liquid AI's new LFM2.5-8B-A1B model, praising its speed and reliability for agentic tasks while noting coding remains a weakness. The model is an 8B MoE with 1.5B active parameters and 128K context, optimized for devices and server-side use.
