@0xSero: Locally Part 1 - Apple Silicon Macs give you large pools of memory to run big models, but the token generation speed wi…

X AI KOLs Following Tools

Summary

Apple Silicon Macs offer large memory pools for running big models but with slower token generation, performing best with large MoEs that have low active parameters.

Locally Part 1 - Apple Silicon Macs give you large pools of memory to run big models, but the token generation speed will be lower than most are used to. Macs are best with large MoEs that have low ACTIVE params. Basically when you see a model like Qwen3.5-397B-A17B this
Original Article
View Cached Full Text

Cached at: 04/22/26, 03:00 PM

Locally Part 1 - Apple Silicon Macs give you large pools of memory to run big models, but the token generation speed will be lower than most are used to. Macs are best with large MoEs that have low ACTIVE params. Basically when you see a model like Qwen3.5-397B-A17B this

Similar Articles

@sitinme: There's a pretty interesting open-source project called Cider, specifically designed to accelerate local AI inference on Macs with Apple Silicon chips. Many people buy a Mac mini or MacBook Pro and want to run models locally, but often encounter issues like insufficient speed and high memory usage. Actually...

X AI KOLs Timeline

Cider is an open-source project designed for Apple Silicon Macs, accelerating local AI inference by fully leveraging the computing power of M-series chips. It is compatible with the MLX ecosystem, supports models like Qwen and Llama, and is easy to install.