decode-tps


Weird to get near linear scaling by adding another GPU?

Reddit r/LocalLLaMA · 2026-06-08

A user reports near-linear scaling after adding a second RTX 3090 for inference with a Qwen model: decode throughput (tokens per second, TPS) improved roughly 1.8x, without NVLink.
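To put "near-linear" in numbers, a minimal sketch of the usual scaling-efficiency arithmetic follows. The function name `scaling_efficiency` is hypothetical and chosen for illustration. The only inputs taken from the summary are the roughly 1.8x speedup and the two GPUs. The post's actual TPS figures are not given here, so none are assumed.

```python
def scaling_efficiency(speedup: float, num_gpus: int) -> float:
    """Return the fraction of ideal linear scaling that was achieved.

    1.0 means perfectly linear scaling (N GPUs give an N-times speedup).
    `scaling_efficiency` is a hypothetical helper name, not from the post.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return speedup / num_gpus


# Values from the summary: roughly 1.8x decode TPS with 2 GPUs.
efficiency = scaling_efficiency(speedup=1.8, num_gpus=2)
print(f"Scaling efficiency: {efficiency:.0%}")
```

Under these assumptions, a 1.8x speedup on two GPUs corresponds to about 90% of ideal linear scaling.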


