2x 512GB RAM M3 Ultra Mac Studios

Reddit r/LocalLLaMA News

Summary

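A user shares a $25k hardware setup of two M3 Ultra Mac Studios, each with 512GB of RAM, for running large language models locally. So far they have tested DeepSeek V3.2 Q8 with the exo distributed inference backend and are currently running GLM 5.1 Q4 on each machine while troubleshooting why exo will not load the Q8 version. They are waiting for the community to optimize Kimi 2.6 for MLX/mmap and are taking requests for other models to test.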

$25k in hardware. tell me what you want me to load on them and i'll help test. i've done deepseek v3.2 Q8 so far with exo backend. currently running GLM 5.1 Q4 on each (troubleshooting why exo isn't loading the Q8 version). patiently awaiting kimi2.6 for when the community optimizes it for MLX/mmap.
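
For anyone suggesting models to test, a rough back-of-the-envelope check of whether a given quantization fits on one machine or needs both can be sketched as below. This is only an estimate of weight storage under stated assumptions: the 700B parameter count is a hypothetical placeholder and not the size of any model named in the post, the bits-per-weight figures are nominal (real Q8/Q4 formats carry some extra overhead), and the 0.8 headroom factor for KV cache and the OS is a guess.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal GB.

    params_billions * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB.
    """
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         ram_gb_per_machine: float, machines: int = 1,
         headroom: float = 0.8) -> bool:
    """True if the weights fit in the pooled RAM after reserving headroom.

    headroom is an assumed fraction of RAM usable for weights; the rest is
    left for KV cache, activations, and the operating system.
    """
    budget = ram_gb_per_machine * machines * headroom
    return weight_memory_gb(params_billions, bits_per_weight) <= budget


if __name__ == "__main__":
    hypothetical_params_b = 700  # placeholder size, not a real model's count
    for label, bits in (("Q8", 8), ("Q4", 4)):
        size = weight_memory_gb(hypothetical_params_b, bits)
        one = fits(hypothetical_params_b, bits, 512, machines=1)
        two = fits(hypothetical_params_b, bits, 512, machines=2)
        print(f"{label}: ~{size:.0f} GB, fits on one: {one}, fits on two: {two}")
```

Under these assumptions, a model of that hypothetical size would fit on a single 512GB machine at Q4 but would need both machines pooled at Q8, which is the kind of split a distributed backend is meant to handle.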