coding-models

#coding-models

@GergelyOrosz: This is from a popular inference provider GLM-5.2 plus the US banning the most capable new models means open source cau…

X AI KOLs Following · 4h ago

GLM-5.2 is a new open-source coding model that has caught up to closed-source SOTA models, potentially disrupting revenues of OpenAI and Anthropic.

#coding-models

DeepReinforce releases Ornith-1.0 open-source coding models (2 minute read)

TLDR AI · yesterday

DeepReinforce open-sources Ornith-1.0, a family of self-improving coding models from 9B to 397B parameters, trained on Gemma 4 and Qwen 3.5 foundations, featuring a novel RL approach that learns to generate its own scaffolds.

#coding-models

Gemma 4 beats Qwen 3.5 (UPDATE), and Qwen 3.6 27B + MiniMax M2.7 is the best OpenCode setup

Reddit r/LocalLLaMA · 2026-04-23

A personal benchmark finds that Gemma-4E4B is the best choice for routing, that Qwen-3.6 27/30B beats Gemma-4 for coding, and that MiniMax M2.7 MXFP4 can replace the giant Qwen-3.5 quants in an OpenCode llama-swap workflow.

#coding-models

Google ramps up agentic AI efforts amid pressure from Anthropic

Reddit r/singularity · 2026-04-20

Google has formed a dedicated strike team to improve its coding AI models, ramping up agentic AI efforts amid competitive pressure from Anthropic. This signals an intensifying race in AI coding capabilities between major AI labs.

#coding-models

Why we no longer evaluate SWE-bench Verified

OpenAI Blog · 2026-02-23

OpenAI announces it will no longer report SWE-bench Verified scores, citing two critical issues: 59.4% of failed problems have flawed test cases that reject correct solutions, and frontier models have seen benchmark problems during training, making improvements reflect training data exposure rather than genuine capability gains.
