2x 512GB RAM M3 Ultra Mac Studios
Summary
A user shares their $25k hardware setup: two M3 Ultra Mac Studios, each with 512GB of RAM, for running large language models locally. They have tested DeepSeek V3 Q8 and GLM 5.1 Q4 via the exo distributed inference backend and are awaiting MLX optimization for Kimi 2.6.
Similar Articles
@Prince_Canuma: My home compute for MLX and research: • M3 Ultra — 512GB (sponsored by community + @wai_protocol) • RTX PRO 6000 — 96GB…
A researcher shares their home compute setup for MLX and AI research, featuring M3 Ultra with 512GB, RTX PRO 6000 with 96GB, and M3 Max with 96GB for model porting and stress testing.
@tom_doerr: Runs 35B models on 16GB RAM Macs https://github.com/walter-grace/mac-code…
A tool that enables running large language models like Qwen3.5-35B on 16GB Macs by streaming model weights from SSD, achieving up to 30 tok/s with an optimal configuration.
@antirez: DeepSeek v4 PRO running via SSD streaming on my 128GB MacBook m5 max. 1.6 trillion parameters.
DeepSeek v4 PRO, a 1.6-trillion-parameter model, runs via SSD streaming on a 128GB M5 Max MacBook, demonstrating that a model far larger than available memory can be served locally.
Running local models on an M4 with 24GB memory
A guide on running local AI models like Qwen 3.5-9B on an M4 MacBook with 24GB RAM using tools like LM Studio, Ollama, and pi, including specific configuration tips for optimal performance.
@remilouf: Following @julien_c’s tweet I bought a MacBook Pro with 128GB unified memory, and started running Qwen3.6 as my daily dr…
The author shares their experience running the Qwen3.6 model on a MacBook Pro with 128GB of unified memory, praising Apple's hardware efficiency for local AI inference.