A user shares their positive experience using Qwen 3.6 27B locally for complex research and coding, finding it outperforms Gemini Pro in career advice and immigration research, while also noting performance issues with Gemma 4 31B.
This is more of a quick appreciation post for Qwen 3.6 27B running locally (8-bit unsloth quant). I've been using it mainly alongside my 35B model in OpenCode for planning and coding. I also had it set up in Open WebUI, but until MTP support landed in llama.cpp about two weeks ago, the TPS was so painfully slow on OWUI that it was basically unusable for chat. Since then, I paired them together and have been using Qwen 27B as a daily chat assistant alongside Gemini Pro. I've been keeping a running mental comparison between the two. For straightforward questions, Gemini handles things fine. But over the weekend I dove into some career advice and company portfolio deep dives, plus some immigration research. Gemini completely fell apart on this. It started hallucinating and fixating on things from earlier messages in the conversation and from my previous chats. I think this degradation started over the last couple of weeks or so, and I'd like to hear about others' experience with Gemini lately. I ended up doing a lot of manual research myself. Then I decided to try the same research with Qwen 3.6 27B. I was genuinely surprised by how much better it performed on both the career/company stuff and the immigration research. The immigration results really stood out because it had to actually go through official documentation and make sense of it rather than just regurgitating something. Side note: I've also tried Gemma 4 31B, which I heard is great for research and planning, but it's just too slow on my M5 Max with 128GB at an 8-bit quant. Curious to hear folks' opinions on that; maybe once MTP is enabled for it I will try it again.
A user reports a positive experience with Qwen 35B A3B for agentic coding tasks, noting that it outperforms Gemma 4 26B in their use case and works well for demo/data analytics, especially in agentic mode rather than chat.
The author benchmarks small local LLMs, highlighting Qwen 3.6 35B A3B for its superior ability to map academic code to research papers compared to models like Gemma 4 and Nemotron 3 Nano.
A discussion of whether the older QwQ-32B model is still useful compared to newer alternatives like Qwen 3.6 27B and Gemma 4, particularly for coding tasks.
A user reports that Qwen 3.5 122B significantly outperforms Qwen 3.6 35B on multi-step tasks despite benchmark claims, and asks whether quantization or setup issues are to blame.