A quantized 478B-parameter GLM-5.1 model runs on 4×RTX Pro 6000 GPUs via SGLang, delivering a 370k-token context at up to 45 tok/s decode and 1,340 tok/s prefill; a demo shows it driving Figma.
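A multi-GPU SGLang deployment like the one described is typically launched with tensor parallelism across the four cards; a minimal sketch, assuming a local quantized checkpoint (the model path and exact context length here are illustrative, not from the source):

```shell
# Serve a quantized model across 4 GPUs with SGLang tensor parallelism.
# --tp 4 shards the model over the four RTX Pro 6000s;
# --context-length sets the maximum context window to match the claimed 370k tokens.
# The model path is a placeholder for whatever quantized checkpoint is used.
python -m sglang.launch_server \
  --model-path /models/glm-quantized \
  --tp 4 \
  --context-length 370000 \
  --host 0.0.0.0 \
  --port 30000
```

Once running, the server exposes an OpenAI-compatible endpoint on the chosen port, which is how a tool like a Figma-driving agent would connect to it.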
A researcher shares their home compute setup for MLX and AI research: an M3 Ultra with 512GB, an RTX PRO 6000 with 96GB, and an M3 Max with 96GB, used for model porting and stress testing.