@AlexFinn: I can't believe this is real I have GLM 5.2 running 100% locally on my Mac Studio. 2 bit quant. The results I'm getting…
Summary
A user reports running GLM 5.2 locally on a Mac Studio with 2-bit quantization, claiming it outperforms Opus 4.8 and enables free, private superintelligence for coding and agent tasks.
Cached at: 06/20/26, 02:36 PM
I can’t believe this is real
I have GLM 5.2 running 100% locally on my Mac Studio. 2 bit quant.
The results I’m getting are better than Opus 4.8
It’s now powering my Hermes Agent and Codex. 100% free, local, private super intelligence on my desk
I also have it in a loop coding for me 24/7 now
I thought we were at least a year away from this type of event. It happened today.
The model takes up about 250 GB of memory, so you can technically run it on a Mac Studio with 256 GB, but you probably want the 512 GB version (please tell me you listened to me 5 months ago when these were sitting on store shelves)
With Fable gone, I now have Opus 4.8 level intelligence on my desk for free. This is the future.
Local, private, secure, personal super intelligence.
If you’re still writing off local AI as a fad or engagement bait, you are officially delusional
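As a rough check on the memory figures in the post, here is a minimal sketch of the headroom each Mac Studio configuration would have. It assumes the roughly 250 GB footprint quoted above covers the weights only and ignores KV cache, context, and OS usage, none of which the post specifies. The function name `headroom_gb` is hypothetical.

```python
def headroom_gb(total_memory_gb: float, model_gb: float = 250.0) -> float:
    """Memory left after loading the model weights.

    Assumption: model_gb is the ~250 GB figure from the post; KV cache,
    context buffers, and OS usage are NOT accounted for here.
    """
    return total_memory_gb - model_gb


if __name__ == "__main__":
    for total in (256, 512):
        left = headroom_gb(total)
        print(f"{total} GB Mac Studio: {left:.0f} GB left after weights")
```

Under these assumptions the 256 GB machine has only about 6 GB to spare, which is consistent with the post's advice to prefer the 512 GB configuration.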
Similar Articles
GLM 5.2 on Mac Studio Speedup PR
GLM 5.2 delivers major performance gains on Mac Studio with 512GB RAM, achieving prefill speeds above 100 t/s at high context lengths and enabling 4-bit quantization for contexts over 100k tokens, as detailed in a pull request by the oMLX creator.
@pcuenq: GLM 5.2 has just been released Here it's already running with MLX on two Mac Studios (M3 Ultra). This is comparable to …
GLM 5.2, an open-weight AI model comparable to top closed models, has been released and is now running on MLX on two Mac Studios (M3 Ultra).
@UnslothAI: GLM-5.2 can now be run locally! The 2-bit model retains ~82% accuracy after we shrunk it from 1.51TB to 238GB (-84% siz…
UnslothAI announces that GLM-5.2, Z.ai's strongest open model with 744B parameters, can now be run locally: dynamic GGUF quantization cuts its size by ~84% to 239GB while retaining ~82% accuracy. The 2-bit model fits on 256GB Macs and supports long-context, reasoning, and agentic tasks.
@_MaxBlade: I CANNOT believe im saying this right now... but GLM 5.2 in open code is SHITTING on opus 4.8 in claude code. how is th…
A user claims that the open-source GLM 5.2 model running in OpenCode outperforms Opus 4.8 running in Claude Code on coding tasks, expressing disbelief.
Cheapest way to run GLM 5.x locally that's not a unified memory system?
A discussion on the cheapest local hardware setups for running GLM 5.x and similarly sized models at 4-bit quantization, including CPU-only and multi-GPU options, with a user sharing their experience running Minimax 2.7 and Qwen 3.6 on a 5900X + 128GB DDR4 + 7900XT setup.
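The size figures quoted in the UnslothAI entry above can be sanity-checked with a little arithmetic. This is a sketch only: it assumes decimal units (1 TB = 1000 GB), takes the 1.51 TB and 238 GB figures from the tweet (the summary says 239 GB), and assumes both sizes are dominated by the 744B weights. The helper names are hypothetical.

```python
PARAMS = 744e9          # parameter count quoted in the summary
ORIGINAL_GB = 1510.0    # 1.51 TB, assuming 1 TB = 1000 GB
QUANTIZED_GB = 238.0    # 2-bit dynamic GGUF size quoted in the tweet


def size_reduction(original_gb: float, quantized_gb: float) -> float:
    """Fraction of the original size removed by quantization."""
    return 1.0 - quantized_gb / original_gb


def bits_per_weight(size_gb: float, params: float) -> float:
    """Average bits stored per parameter, treating the file as weights only."""
    return size_gb * 1e9 * 8 / params


if __name__ == "__main__":
    print(f"reduction: {size_reduction(ORIGINAL_GB, QUANTIZED_GB):.1%}")
    print(f"original:  {bits_per_weight(ORIGINAL_GB, PARAMS):.2f} bits/weight")
    print(f"quantized: {bits_per_weight(QUANTIZED_GB, PARAMS):.2f} bits/weight")
```

With these inputs the reduction comes out at about 84%, matching the quoted figure, and the quantized file averages roughly 2.6 bits per weight. An average above 2 bits would be consistent with a "dynamic" scheme that keeps some layers at higher precision, though the source does not describe the layout.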