@alex_whedon: Hey, folks! We have been blown away by the response to SubQ and the SSA breakthrough over the last 48 hours. It is awes…
Summary
The SubQ team reports an overwhelming response to the SSA breakthrough and plans to release a model card with additional data and third-party validation next week.
Similar Articles
@seclink: Just hit 134 tok/s with Qwen 3.5-27B Dense and 73 tok/s with the new Qwen 3.6-27B on a single RTX 3090. The 2026 open-source scene is moving at lightspeed…
A single RTX 3090 reaches 134 tok/s on Qwen 3.5-27B Dense and 73 tok/s on the newer Qwen 3.6-27B via fused kernels plus speculative decoding, with GGUF releases landing the same evening.
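Speculative decoding does most of the heavy lifting in numbers like these: a small draft model proposes a burst of tokens cheaply, and the large target model verifies them all in one batched forward pass. Below is a minimal greedy-variant sketch in Python, not the poster's actual pipeline; `draft_next` and `target_logits` are hypothetical stand-ins for the two models.

```python
# Minimal sketch of greedy speculative decoding (illustration only, not
# seclink's setup). A cheap draft model proposes k tokens; the target
# model scores the whole candidate sequence in one forward pass, and we
# keep the longest prefix that matches the target's own greedy choices.
from typing import Callable, List

def speculative_step(
    prompt: List[int],
    draft_next: Callable[[List[int]], int],                  # hypothetical: small model, next token id
    target_logits: Callable[[List[int]], List[List[float]]], # hypothetical: big model, logits per position
    k: int = 8,
) -> List[int]:
    # 1) Draft k candidate tokens autoregressively with the small model.
    ctx = list(prompt)
    drafted: List[int] = []
    for _ in range(k):
        token = draft_next(ctx)
        drafted.append(token)
        ctx.append(token)

    # 2) A single target forward pass over prompt + drafts verifies them all.
    logits = target_logits(prompt + drafted)

    # 3) Accept drafts while they match the target's greedy pick; on the
    #    first mismatch, substitute the target's own token and stop.
    accepted: List[int] = []
    for i, token in enumerate(drafted):
        pos = len(prompt) + i - 1  # logits at pos predict the token at pos + 1
        greedy = logits[pos].index(max(logits[pos]))
        if greedy == token:
            accepted.append(token)
        else:
            accepted.append(greedy)
            break
    else:
        # All k drafts accepted: the same pass also yields one bonus token.
        last = logits[len(prompt) + k - 1]
        accepted.append(last.index(max(last)))
    return prompt + accepted
```

The output is identical to plain greedy decoding with the target model; the speedup comes from verifying many tokens per expensive forward pass instead of one.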
Qwen 3.6 Max Preview just went live on the Qwen Chat website. It currently has the highest AA-Intelligence Index score among Chinese models (52) (Will it be open source?)
Qwen 3.6 Max Preview launched on the Qwen Chat website, achieving the highest AA-Intelligence Index score (52) among Chinese models; it remains unclear whether it will be open-sourced.
@outsource_: NEW GLM+ QWEN 18B RUNS ON CONSUMER GPU IT BEATS 35B MoE AT HALF THE VRAM @KyleHessling1 just dropped the healed Qwopus-…
A new merged, quantized 18B model, Qwopus-GLM-18B-GGUF, reportedly outperforms 35B MoE models while using half the VRAM and running on consumer GPUs.
@bastani_behnam: We just published how we unlocked +50% inference capacity on a 27B model — no new GPUs, no new nodes, at a fraction of …
OpenInfer demonstrates a "vertical disaggregation" technique that boosts Qwen 3.5 27B throughput by ~50%, co-executing quantized layers across a single node's AMD EPYC CPU and Nvidia L40S GPU under a custom SLA-aware scheduler.
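OpenInfer has not published code for this, so the sketch below is only a toy illustration of the stated idea: an SLA-aware placement pass that offloads quantized layers from the GPU to the CPU while a per-token latency budget still holds, freeing GPU memory and compute for more concurrent requests. Layer names, latency figures, and the SLA budget are all invented placeholders.

```python
# Toy illustration of SLA-aware CPU/GPU layer placement (not OpenInfer's
# implementation; "vertical disaggregation" is their term, and the numbers
# here are made up). Start all-GPU, then greedily move the layers with the
# smallest CPU penalty to the CPU while per-token latency stays inside the
# SLA budget.
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    gpu_ms: float  # estimated per-token latency on the GPU (e.g. L40S)
    cpu_ms: float  # estimated per-token latency on the CPU (quantized, e.g. EPYC)

def place_layers(layers: list[Layer], sla_ms: float) -> dict[str, str]:
    placement = {layer.name: "gpu" for layer in layers}
    total_ms = sum(layer.gpu_ms for layer in layers)
    # Offload the cheapest-to-move layers first.
    for layer in sorted(layers, key=lambda l: l.cpu_ms - l.gpu_ms):
        penalty = layer.cpu_ms - layer.gpu_ms
        if total_ms + penalty <= sla_ms:
            placement[layer.name] = "cpu"
            total_ms += penalty
    return placement

if __name__ == "__main__":
    # Placeholder figures: 27 transformer blocks, CPU ~3x slower per block.
    blocks = [Layer(f"block{i}", gpu_ms=0.4, cpu_ms=1.1) for i in range(27)]
    print(place_layers(blocks, sla_ms=15.5))  # offloads 6 blocks under this budget
```

The capacity gain in the post presumably comes from the freed GPU resources serving additional requests; this sketch only shows the placement decision, not the scheduler itself.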
Qwen-3.6-27B, llamacpp, speculative decoding - appreciation post
A Reddit user demonstrates llama.cpp speculative decoding lifting Qwen-3.6-27B generation speed from 13.6 to 136.75 t/s, roughly a 10x gain, and shares the exact commands and hardware setup.
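For anyone wanting to reproduce the idea, the shape of such a run is roughly as below. This is not the poster's exact command: flag names vary between llama.cpp releases (check `llama-server --help`), and the model paths and draft sizes are placeholders.

```python
# Hedged sketch of a llama.cpp speculative-decoding launch via Python.
# Paths are placeholders; --draft-max / --draft-min exist in recent
# llama.cpp builds, but verify against your version's --help output.
import subprocess

subprocess.run([
    "llama-server",
    "-m",  "qwen3.6-27b-q4_k_m.gguf",   # target model (placeholder path)
    "-md", "qwen3.6-draft-q8_0.gguf",   # small draft model (placeholder path)
    "-ngl", "99",                        # offload all target layers to the GPU
    "--draft-max", "16",                 # most tokens drafted per verification step
    "--draft-min", "4",                  # fewest tokens drafted per step
], check=True)
```

The gain depends heavily on how often the draft model's guesses match the target; a small draft model from the same family usually achieves a much higher acceptance rate.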