The author describes running Qwen3.6 35B-A3B locally on an ASUS Zenbook Pro 14 at 27 tokens per second (TPS) with a 32k context, a personal milestone toward fully local AI for privacy.
Opus 4.7 auto-generated a custom WebGPU kernel that uses fused LinearAttention to speed up Qwen3.5 inference by up to 13×; the kernel now ships in Transformers.js v4.2.0.