Tag
GLM-5.2, an open-weight model with Opus-level design capabilities, incorporates an anti-hacking module trained via RL to mitigate reward hacking and improve performance on long-running tasks.