Youtuber tries Qwen 3.5 35B, Qwen 3.6 35B, and Gemma 4 27b to reverse engineer some large JS, with good results for Qwen 3.6

Reddit r/LocalLLaMA 04/22/26, 03:52 AM Models

Summary

Qwen 3.6 35B achieves near-perfect 283/285 line recall on a 108 k-token JS file, outperforming Gemma 4 27B (6/16 passes) and fixing long-context weaknesses of earlier Qwen versions.

Found this interesting and thought I'd share. A big problem I've had with Qwen 3 MoE is how bad it was at instruction following; its 'dumb point' in the context window was also really low. I was so turned off by it that I never tried Qwen 3.5 and kept using SEED OSS 36B for coding. 3.6 appears to have better instruction following than prior models. Do you find this to be the case yourself?

Original Article

Cached at: 04/22/26, 05:10 AM

TL;DR: A head-to-head recall test on a 108 k-token JS file shows Qwen 3.6 35B remembering 283 of 285 target lines while Gemma 4 27B tops out at 6 of 16 tries, proving the new Qwen release fixes the “dumb point” that plagued earlier versions.

## The challenge: reverse-engineering 8000 lines of minified JavaScript

The author needs a **local LLM** that can digest a 336 KB `service.js` (beautified to 108 k tokens) and extract the login-plus-API sequence for an LTE modem’s signal-strength scraper. The file contains 8 000+ lines of repetitive boilerplate, ideal torture material for context-window stress testing.

## Test design: 16 spot-checks on exact line recall

To avoid IDE helpers skewing results, a standalone client feeds the entire file plus a single prompt: “From the function starting at line X, quote the 20 lines that immediately follow its opening brace.” 1300 functions exist; 16 are sampled at random. A run is scored “pass” if ≥8 lines match the ground truth. All models use 8-bit KV-cache (Q8) to stay within 24 GB VRAM.

## Round 1 – Gemma 4 27B (A4B)

### Unsloth Q4K-XL

- 6 / 16 passes
- Giant return statements consistently truncated
- Several commands silently dropped

### LM-Studio Q4KM

- 2 / 16 passes
- Same sliding-window 1 k-token limitation evident

## Round 2 – Qwen 3.5 35B (DeltaNet)

### LM-Studio community build

- 11 / 16 passes
- 245 correct lines recalled, 98 bonus lines also accurate, only 50 truncated
- No failures on large return blocks

### Unsloth Q4KM

- 10 / 16 passes
- Slightly worse; confirms quantization choice matters

## Round 3 – Qwen 3.6 35B (A3B)

### LM-Studio

- 15 / 16 perfect recalls, total miss count: 9 lines

### llama.cpp same quant

- 283 / 285 lines exact, only 2 hallucinated
- Effectively zero context degradation at 108 k tokens

## Take-away

Gemma 4’s 1 k sliding-window attention makes long-file reverse engineering unreliable. Qwen 3.6 35B delivers **near-perfect positional recall** under the same memory budget, finally erasing the “dumb point” that discouraged many from the Qwen 3-series MoE models.

Source: [YouTube – mr_zerolith](https://www.youtube.com/watch?v=ONQcX9s6_co)
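The spot-check procedure from the test design (sample 16 of the functions, ask for the 20 lines after the opening brace, pass at 8 or more matching lines) could be sketched roughly as below. This is not the author's client: the function names are hypothetical, and the matching rule (whitespace-trimmed exact comparison, with each quoted line counted at most once) is an assumption, since the write-up does not say how lines were compared.

```python
import random

PASS_THRESHOLD = 8   # from the write-up: a run passes with >= 8 matching lines
QUOTE_LENGTH = 20    # from the write-up: the prompt asks for 20 lines


def build_prompt(start_line: int) -> str:
    """Format the spot-check prompt quoted in the write-up."""
    return (
        f"From the function starting at line {start_line}, quote the "
        f"{QUOTE_LENGTH} lines that immediately follow its opening brace."
    )


def sample_targets(function_start_lines: list[int], k: int = 16, seed: int = 0) -> list[int]:
    """Pick k function start lines at random (the seed is an assumption,
    added only to make the sketch reproducible)."""
    rng = random.Random(seed)
    return sorted(rng.sample(function_start_lines, k))


def count_matches(expected: list[str], answer: list[str]) -> int:
    """Count ground-truth lines the model reproduced.

    Assumed rule: compare whitespace-trimmed text exactly, and let each
    line of the answer satisfy at most one expected line.
    """
    remaining = [line.strip() for line in answer if line.strip()]
    hits = 0
    for line in expected:
        key = line.strip()
        if key and key in remaining:
            remaining.remove(key)
            hits += 1
    return hits


def is_pass(expected: list[str], answer: list[str]) -> bool:
    """Apply the >= 8 matching lines pass rule."""
    return count_matches(expected, answer) >= PASS_THRESHOLD
```

A stricter scorer could also require the matching lines to appear in the original order, which would penalize a model that recalls the right lines from the wrong position in the file.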

Similar Articles

I tested Qwen3.6-27B, Qwen3.6-35B-A3B, Qwen3.5-27B and Gemma 4 on the same real architecture-writing task on an RTX 5090

Layman's comparison on Qwen3.6 35b-a3b and Gemma4 26b-a4b-it

Qwen 3.6 27B kick balls

Qwen 3.6 35B A3B vs Qwen 3.5 122B A10B

(Interactive)OpenCode Racing Game Comparison Qwen3.6 35B vs Qwen3.5 122B vs Qwen3.5 27B vs Qwen3.5 4B vs Gemma 4 31B vs Gemma 4 26B vs Qwen3 Coder Next vs GLM 4.7 Flash
