pcie

#pcie

PSA: DO NOT use Intel consumer platforms for multi-GPU setups

Reddit r/LocalLLaMA ↗ · 5d ago

Testing reveals that Intel consumer platforms like Z890 with Arrow Lake CPUs have hardware/firmware limitations that prevent proper PCIe Peer-to-Peer (P2P) communication between multiple GPUs, making them unsuitable for multi-GPU AI workloads despite adequate lane counts.

@superalesha: https://x.com/superalesha/status/2077437741915312221

X AI KOLs Timeline ↗ · 2026-07-15 Cached

The author shares six months of measurements on a four-RTX 3090 local setup, revealing that data parallelism often outperforms tensor parallelism for models fitting on fewer cards, with up to 3.4x throughput difference.

Measuring PCIe transfer under dual GPU with pipeline & tensor llama.cpp

Reddit r/LocalLLaMA ↗ · 2026-07-11

An analysis of PCIe transfer performance when running llama.cpp with dual GPUs using pipeline and tensor parallelism.

For dual GPUs, will there be any big impact to inference speeds when running in PCIe 5.0 x8/x4 vs x8/x8?

Reddit r/LocalLLaMA ↗ · 2026-06-26

A user asks whether running dual GPUs in PCIe 5.0 x8/x4 vs x8/x8 significantly impacts LLM inference speeds.

I accidentally crippled my 4x RTX 3090 LLM rig with a hidden PCIe 2.0 x4 slot and fixing it doubled Mistral 128B performance

Reddit r/LocalLLaMA ↗ · 2026-06-04

A user discovered that a hidden PCIe 2.0 x4 electrical limitation on a Threadripper workstation board was crippling one of four RTX 3090s, causing poor multi-GPU LLM inference performance. Fixing the slot layout and switching to tensor split mode doubled Mistral 128B throughput from ~11 to ~24.7 tok/s.

Project Blackwell: It Will Work, Eventually — Making an RTX Pro 6000 Run in a Dell R730 at 650K Context

Reddit r/LocalLLaMA ↗ · 2026-05-30

A developer documents the extensive hardware and firmware hacking required to run an NVIDIA RTX Pro 6000 Blackwell GPU in a legacy Dell PowerEdge R730 server, achieving 650K context length for local AI inference.

AMD to release slottable GPU

Reddit r/LocalLLaMA ↗ · 2026-05-07

AMD is set to release new slottable PCIe-based Instinct GPUs aimed at the enterprise AI market, offering a potential new hardware option for local LLM deployment.
