rtx-4090

@totheagi: We're the first to make the full GLM-5.2 (FP8) run on RTX 4090s. GLM-5.2 is the new 753B SOTA open-weights model, and i…

X AI KOLs Timeline ↗ · yesterday

We're the first to run the full GLM-5.2 (753B, FP8) on RTX 4090s by porting sparse-attention kernels to Ada GPUs, enabling a frontier open-weights model to run on commodity hardware.

Any experience with modded 4090 48GB from GpuWorld.eu?

Reddit r/LocalLLaMA ↗ · 2026-05-16

A user asks for community feedback on buying a modded RTX 4090 with 48GB VRAM from GpuWorld.eu, and for recommendations of trustworthy sources and alternative sellers, such as those on Taobao.

China modded GPU (eg. 4090 48gb) --> I'm gonna figure it out. IS THERE NO ONE ELSE CURIOUS??

Reddit r/LocalLLaMA ↗ · 2026-05-15

A Reddit user expresses curiosity about modded Chinese GPUs (e.g., 48GB RTX 4090) and seeks information on performance, reliability, and sourcing, proposing to form a research group.

Got MTP + TurboQuant running — Qwen3.6-27B -- 80+ t/s at 262K context on a single RTX 4090

Reddit r/LocalLLaMA ↗ · 2026-05-08

A developer reports 80+ t/s inference with Qwen3.6-27B at 262K context on a single RTX 4090 by combining MTP (Multi-Token Prediction) with TurboQuant's lossless KV cache compression, and shares the implementation fork and technical details.
