MAX models can now run on Apple silicon GPUs

Lobsters Hottest 06/28/26, 09:21 AM Models

apple-silicon gpu max-models compatibility ai-models machine-learning

Summary

MAX models have been updated to run on Apple silicon GPUs, enabling faster inference on Macs.

Comments: https://lobste.rs/s/4srepl/max_models_can_now_run_on_apple_silicon

Similar Articles

Apple announced new on device inference engine for Apple Silicon

Reddit r/LocalLLaMA

At WWDC, Apple announced CoreAI, a new on-device inference engine for Apple Silicon that replaces CoreML and supports larger models of up to 20B parameters through optimized inference, with a focus on phones and tablets.

@akshay_pachaar: Apple finally did it. Its new framework, Core AI, runs models entirely on Apple silicon, so inference happens on the us…

X AI KOLs Following

Apple released Core AI, a new framework that runs AI models entirely on Apple silicon devices (iPhone, iPad, Mac, Vision Pro) with zero server calls. It includes a memory-safe Swift API, model export recipes for PyTorch, an optimizer, and debugging tools, supporting models like Qwen, Mistral, and SAM3.

@neural_avb: I am working on porting SAM models and harness into Apple silicon. Already seeing 1.25x inference speed increase on mlx…

X AI KOLs Following

The author is porting SAM 2.1 models to Apple silicon with MLX and reports a 1.25x inference speed increase on the small model so far, with quantized versions planned.

@PyTorch: ExecuTorch now has an MLX delegate that runs PyTorch models on Apple Silicon GPUs. It supports LLMs, speech-to-text, an…

X AI KOLs Following

ExecuTorch now has an MLX delegate that enables GPU-accelerated inference for PyTorch models on Apple Silicon Macs, supporting LLMs, speech-to-text, and MoE models with quantization via TorchAO.

@HuggingModels: Gemma 4 is here, and it's optimized for Apple Silicon. This 4-bit quantized model runs fast on your Mac, not just in th…

X AI KOLs Timeline

A 4-bit quantized version of Gemma 4, optimized for Apple Silicon, enables fast local inference on Macs and reduces reliance on cloud computing.
