@Hesamation: Remember this? 20 days ago SubQ claimed to have developed a model with 12M context window, 95% cheaper than Opus, and t…
Summary
SubQ claimed a breakthrough model with a 12M context window and a 95% cost reduction versus Opus at the same intelligence level. It promised a paper and model card but has delivered neither, which the author says looks like a scam or, at best, shady behavior.
Cached at: 05/27/26, 07:20 AM
Remember this? 20 days ago SubQ claimed to have developed a model with 12M context window, 95% cheaper than Opus, and the same intelligence level.
they promised to release the paper and model card “next week”. that was 10 days ago. NOTHING.
the only update after this launch was a 3rd-party eval by Appen, which mentioned that it evaluated via the Subquadratic API and did not receive model weights. that’s normal, but not much of a proof.
if this is not a scam (which so far sets off every obvious red flag of being one) it’s super shady. you cannot make revolutionary research claims and still act like you’re running a sales pitch in a YC launch post.
the main problem is simple: they’re using a breakthrough claim to buy attention, credibility, and likely investor eyeballs before giving the community anything concrete to evaluate. very unprofessional compared to how an “ai lab” must conduct its research release.
Alexander Whedon (@alex_whedon): Introducing SubQ - a major breakthrough in LLM intelligence.
It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA),
And the first frontier model with a 12 million token context window which is:
- 52x faster than FlashAttention at 1MM tokens
Similar Articles
@no_stp_on_snek: https://subq.mildlyconcerning.com
This article critically analyzes the claims and timeline of the SubQ long-context AI technique, highlighting discrepancies and walkbacks from the original announcement.
Why is every "context layer" tool lying about token savings?
The author critiques the lack of transparent benchmarking in emerging context layer and MCP optimizer tools that promise drastic token savings, noting that real-world tests fail to replicate claimed efficiencies. They urge developers to demand open, reproducible benchmarks and ask for recommendations of tools that actually deliver measurable results.
I don’t believe this benchmark 27b size model next opus 4.5! Anyone can confirm testing with real agentic workflow?
A 27B parameter model reportedly outperforms Opus 4.5 on a benchmark, prompting community skepticism and requests for real-world agentic workflow validation.
@no_stp_on_snek: https://x.com/no_stp_on_snek/status/2052833502475833384
An open-source stack using Qwen2.5-32B-Instruct with longctx and vllm-turboquant on a single AMD MI300X achieves competitive results (0.601-0.688) versus SubQ's closed model (0.659) on the MRCR v2 1M-context benchmark, demonstrating open-weights approaches are within striking distance.
@outsource_: BREAKING QWOPUS 3.6 27B IS FULLY LIVE! SOTA QWEN 3.6 27b + Opus IS HERE!!!! Agentic coding GOATED: 75.25% (152/202) on …
Qwopus 3.6 27B is now fully live, a merged model (Qwen + Opus) achieving state-of-the-art agentic coding performance with 75.25% on SWE MMLU Pro, handling 303k token context at Q8 KV cache, and running on 24GB VRAM at Q5_K_M quantization.