New SOTA: Poetiq uses a self-optimizing harness with Gemini 3 Flash to surpass models such as Opus 4.7
Summary
Poetiq claims new state-of-the-art coding performance using a self-optimizing harness with Gemini 3 Flash, surpassing Opus 4.7.
Similar Articles
@poetiq_ai: Poetiq's Meta-System built its own coding harness from scratch. It got SOTA on LiveCodeBench Pro. No fine-tuning, no sp…
Poetiq's Meta-System achieved state-of-the-art results on LiveCodeBench Pro by autonomously building a coding harness using standard APIs and Gemini 3.1 Pro, without fine-tuning or special model access.
Gemini 3.5 Flash Looks Good For How Fast It Is (8 minute read)
Google released Gemini 3.5 Flash, a hybrid speed model that rivals Opus 4.7 and GPT-5.5 in speed and cost while performing well on agentic and coding benchmarks.
Poetiq: Recursive Self-Improvement Delivers New SOTA Coding Performance
Poetiq's Meta-System, using recursive self-improvement via standard API access without fine-tuning, achieves new state-of-the-art results on the LiveCodeBench Pro coding benchmark, outperforming leading models like GPT-5.5.
Gemma4_31b_fp8 keeping up with Sonnet_4.6_medium in my harness.
A user reports that Gemma4_31b in FP8 keeps pace with Sonnet_4.6_medium in a custom harness across tasks such as Cypher query generation, entity extraction, agentic tool calling, code writing, and multi-vector retrieval synthesis.
@nick_kango: One more task to add to my twitter benchmark collection:) Btw, Opus 4.8 and all the SOTA models passed when i tried tha…
Nick Kang adds a new task to his Twitter benchmark collection; Claude Opus 4.8 and other SOTA models pass, while Sonnet 4.6 and Grok 4.3 fail. Alfin remarks on Opus 4.8's dangerous capabilities.