A quantized 478B-parameter GLM-5.1 model runs on 4×RTX Pro 6000 GPUs via SGLang, delivering a 370k-token context window at up to 45 tok/s decode and 1340 tok/s prefill, and is demoed driving Figma.
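For readers wanting to reproduce a setup like this, a minimal sketch of an SGLang server launch is below. It assumes SGLang's standard `launch_server` CLI; the model path and quantization method shown are illustrative assumptions, not confirmed details from the report.

```shell
# Hypothetical launch command for serving a quantized GLM-5.1 across 4 GPUs.
# Model path and quantization choice are assumptions for illustration.
python -m sglang.launch_server \
  --model-path zai-org/GLM-5.1 \
  --quantization fp8 \
  --tp-size 4 \
  --context-length 370000
```

Tensor parallelism (`--tp-size 4`) shards the weights across the four cards, which is what makes a model of this size fit in workstation-class VRAM once quantized.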
Anthropic unveils a withheld Claude Mythos model that autonomously finds thousands of zero-days, ZAI open-sources the 1.5 TB GLM-5.1, which tops open-weight benchmarks, Alibaba’s unreleased HappyHorse video model hits #1 on public leaderboards, and Deepseek teases an “Expert Mode” v4 preview.