Collected the infinity stones

Reddit r/LocalLLaMA News

Summary

A user proposes building a heterogeneous AI cluster using Blackwell GPUs and high-memory servers connected via RDMA, seeking collaboration on Tinygrad driver development.

2.3 TB of RAM in here. 400+ vCores. All that’s left is plugging it into the Blackwell with the driver to do RDMA, and it’s over. Using Blackwells for prefill, RDMA to the studio mesh for decode. I think this would be the first heterogeneous cluster. I do, however, need help with the Tinygrad driver to make this work. If anyone with knowledge of these domains would like to collaborate, let me know via PM. We are very close here.
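
For readers unfamiliar with the prefill/decode split described in the post: the prefill side builds the KV cache for the prompt, and that cache has to cross the RDMA link before the decode side can generate tokens. The sketch below is only a back-of-the-envelope estimate of that handoff cost. The model shape (32 layers, 8 KV heads, head dimension 128, FP16) and the 10 GB/s link speed are hypothetical placeholders, not figures from the post, and the function names are made up for illustration.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, dtype_bytes=2):
    """Size of the KV cache for one sequence.

    The leading factor of 2 accounts for storing both keys and values
    per layer. All arguments are hypothetical model parameters.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes


def transfer_seconds(num_bytes, link_bytes_per_s):
    """Idealized transfer time: payload size divided by link bandwidth.

    Ignores latency, protocol overhead, and any overlap with compute.
    """
    return num_bytes / link_bytes_per_s


if __name__ == "__main__":
    # Hypothetical model shape and prompt length, for illustration only.
    size = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=4096)
    # Hypothetical link speed of 10 GB/s.
    seconds = transfer_seconds(size, 10e9)
    print(f"KV cache: {size / 2**20:.0f} MiB, handoff: {seconds * 1000:.1f} ms")
```

Under these assumed numbers the cache is 512 MiB per 4096-token prompt, so the link bandwidth between the prefill and decode sides directly bounds how quickly decode can start.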
