@PatrickToulme: I ran GLM 5.2 with OpenCode harness against Claude Opus this week deployed locally. Bottom line: It is a real frontier …
Summary
GLM 5.2 is a frontier open-source coding model that performs near Claude Opus quality on coding tasks, with excellent tool calling, planning, and local deployment capabilities, at no cost.
Cached at: 06/22/26, 01:31 AM
I ran GLM 5.2 with OpenCode harness against Claude Opus this week deployed locally.
Bottom line: It is a real frontier coding model and insanely good for the price (free). Open source model + open source harness + local serving on my own chips is an amazing value proposition.
Some Notes:
- Tool calling is very good: it spun up nested subagents on its own, multiple levels deep
- Very good at research and planning, including long-range plans
- It built a cell-based terminal renderer at near Opus quality. I still lean Claude, but most people couldn’t tell the outputs apart
- Opus wins on one-shotting and reading my intent without me explicitly telling it
- GLM 5.2 is more than enough intelligence for most F500 work IMO
- GLM 5.2 is good enough to hill climb RL with and to drive further AI development / the next generation GLM model. Progress will be much faster for their RL from here.
- Running my own endpoint = permanent fast mode
- It wastes thinking tokens writing code in the reasoning block
At this point I would consider GLM 5.2 a true frontier coding model. Getting to this point in coding quality was the hardest part IMO. They will progress quickly from here in RL.
Agreed. I can see a world in which some customers and enterprises still pay the closed source model premium for the max intelligence, but many enterprises who do not want to spend billions will offer employees self hosted open source models for low cost.
I served it on H100s. I did not measure tok/s, but it's faster than Claude generations, I'd say.
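For anyone who does want a tok/s number from a self-hosted endpoint, here is a minimal sketch. It assumes you can wrap your client in a callable that yields one item per generated token; `measure_tokens_per_second` and `stream_tokens` are hypothetical names for illustration, not part of OpenCode or any serving stack mentioned in this thread. The figure is end-to-end throughput, so it includes time to first token, and if your client streams multi-token chunks the count will undercount.

```python
import time
from typing import Callable, Iterable


def measure_tokens_per_second(
    stream_tokens: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one streamed generation and return items yielded per second.

    stream_tokens is a hypothetical zero-argument callable that starts a
    generation and yields one string per token. clock is injectable so the
    function can be checked without a real endpoint.
    """
    start = clock()
    count = 0
    for _ in stream_tokens():
        count += 1
    elapsed = clock() - start
    if elapsed <= 0:
        # Guard against a zero or negative interval from a coarse clock.
        return 0.0
    return count / elapsed
```

Averaging over several prompts of different lengths gives a steadier number than a single run.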
I’m not a bot. I ran it on H100s
They will hill climb with RL though. Getting GLM models to this point in agentic coding was the hardest part. It is exponential growth from here, assuming they have enough compute.
There is a price-quality curve. For example, most F500s, if they can get Opus tier for almost free, will give that to most employees versus Claude Fable.
However, there will still be some customers, quant funds for example, who need the absolute highest intelligence and will pay whatever Anthropic or OpenAI charge.
I ran this on 8 H100s. Was fast but honestly still too slow. Will try 8 Blackwells soon.
I am trying to get it to run on TPUs right now as well.
H100. Hopefully TPUs soon.
Yes 100%
They will gather a large amount of trajectories and RL on positive ones. Just time + compute. There is no secret.
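As a rough illustration of "RL on positive ones", here is a minimal sketch of the filtering step. It assumes each trajectory carries a scalar reward where a value above zero means the task succeeded. The `Trajectory` type, `keep_positive`, and `success_rate` are hypothetical names, and this filter-then-train approach (often called rejection sampling) is only one simple reading of the claim. The thread does not say what method is actually used to train GLM models.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    # Hypothetical record of one agentic coding episode.
    prompt: str
    actions: list[str]
    reward: float  # assumed: > 0 means success, e.g. the tests passed


def keep_positive(
    trajectories: list[Trajectory], threshold: float = 0.0
) -> list[Trajectory]:
    """Return only the trajectories whose reward exceeds the threshold."""
    return [t for t in trajectories if t.reward > threshold]


def success_rate(trajectories: list[Trajectory]) -> float:
    """Fraction of trajectories that pass the positive filter."""
    if not trajectories:
        return 0.0
    return len(keep_positive(trajectories)) / len(trajectories)
```

Under this reading, a stronger base model has a higher `success_rate`, so each batch of sampled trajectories leaves more usable training data after filtering.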
8x H100. I did not measure tokens/s.
Because GLM 5.2 has so many positive trajectories.