glm-5.1

#glm-5.1

@mr_r0b0t: Official @NVIDIAAI GLM5.1-NVFP4 spotted on @huggingface

X AI KOLs Timeline · 2026-05-28

NVIDIA has released GLM-5.1-NVFP4, an NVFP4-quantized version of Z.ai's GLM-5.1 model (754B total parameters, 40B activated), on Hugging Face under the MIT license.


#glm-5.1

@0xSero: Finally GLM-5.1-505B-REAP-NVFP4 45 tokens/s decode 1350 tokens/s prefill 32% prune This was the hardest I ever worked t…

X AI KOLs Timeline · 2026-04-20

Developer @0xSero reports running GLM-5.1-505B-REAP-NVFP4, a GLM-5.1 variant pruned by 32% and quantized to NVFP4, at 45 tokens/s decode and 1350 tokens/s prefill.

