Do you guys think Subquadratic actually has a 12 million context model?
Summary
Subquadratic claims to have a model with a 12-million-token context window, but access is limited to partners. The model performs well on the "needle in a haystack" test, but there is no evidence of general reasoning ability, which raises doubts about the claim.
Cached at: 06/18/26, 03:31 AM
Similar Articles
Does subquadratic's 12 million context model claim hold any water?
The video examines whether a claimed 12 million context model from subquadratic research is credible, analyzing its technical underpinnings and potential limitations.
Subquadratic AI introduces SubQ-1.1-Small, a new model using Smart Sparse Attention
Subquadratic AI introduces SubQ-1.1-Small, a model leveraging Smart Sparse Attention to achieve near-perfect long-context retrieval up to 12M tokens with up to 1,000x attention compute reduction. It balances long-context optimization with strong general reasoning, outperforming baselines on benchmarks like NIAH and RULER.
@Hesamation: Remember this? 20 days ago SubQ claimed to have developed a model with 12M context window, 95% cheaper than Opus, and t…
SubQ claimed a breakthrough model with a 12M context window and a 95% cost reduction versus Opus, but it has not delivered the paper and model card it promised, prompting strong suspicion of a scam or shady behavior.
@sanbuphy: K2.6 successfully downloaded and deployed the Qwen3.5-0.8B model locally on a Mac, using the niche Zig language to implement and optimize inference, demonstrating the new model’s generalization ability. After 4,000+ tool calls and 12+ hours of continuous operation, K2.6 iterated 14 times…
K2.6 successfully downloaded and deployed the Qwen3.5-0.8B model locally on a Mac, using the niche Zig language to implement and optimize inference, demonstrating the new model’s generalization ability. After 4,000+ tool calls and 12+ hours of continuous operation, K2.6 iterated 14 times, boosting throughput from ~15 tokens/s to ~193 tokens/s, ultimately achieving 20% faster inference than LM Studio.
Deepseek V4's 1M context window: the breaking point
A detailed evaluation of Deepseek V4's 1M token context window across production codebases reveals optimal performance at 150-250k tokens, with degradation past 300k and significant latency in reasoning mode. The model exhibits high hallucination rates on unknown tasks, requiring validation layers for production use.