@ivanfioravanti: Interesting video of M5 Max, on impact of Low, Automatic and High power modes on inference.


Summary

A performance test demonstrates the impact of Low, Automatic, and High power modes on LLM inference speed on an M5 Max MacBook, showing significant differences in token generation rates and power consumption.

Interesting video of M5 Max, on the impact of Low, Automatic, and High power modes on inference.

- No external monitor attached.
- Model not relevant, but it's DS4 Flash Q2.

Results:

- Low: ~25W, ~12 toks/s
- High: ~120W, ~32 toks/s
- Automatic: varies from ~40W / ~14 toks/s to ~90W / ~29 toks/s, depending on fan speed and the Mac's temperature.

If you really want to push your MacBook to the max, use High Power mode with no external monitors; with monitors attached I see very strange behavior that I'm still investigating.
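For anyone who wants to reproduce this kind of measurement, here is a minimal sketch. It assumes an Apple Silicon Mac that exposes pmset's powermode key (the machines that support High Power Mode), mlx-lm installed via pip, and an illustrative model path; the post doesn't say which inference stack was used, so all of these are assumptions. Power draw can be sampled separately with `sudo powermetrics --samplers cpu_power`.

```python
# Sketch: compare tokens/s across macOS power modes on Apple Silicon.
# Assumptions (not from the post): mlx-lm is installed (pip install mlx-lm),
# the model path below is a hypothetical placeholder, and the machine
# supports the `powermode` pmset key. Power-mode changes require sudo.
import subprocess
import time

from mlx_lm import load, generate

POWER_MODES = {"automatic": "0", "low": "1", "high": "2"}

def set_power_mode(mode: str) -> None:
    # pmset's powermode key on supported Macs: 0 = automatic, 1 = low, 2 = high.
    subprocess.run(["sudo", "pmset", "-a", "powermode", POWER_MODES[mode]], check=True)

def tokens_per_second(model, tokenizer, prompt: str, max_tokens: int = 256) -> float:
    # Time a fixed-length generation and divide token count by wall time.
    start = time.perf_counter()
    text = generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
    elapsed = time.perf_counter() - start
    return len(tokenizer.encode(text)) / elapsed

model, tokenizer = load("mlx-community/some-quantized-model")  # hypothetical path
prompt = "Explain KV caching in one paragraph."
for mode in POWER_MODES:
    set_power_mode(mode)
    time.sleep(10)  # let fans and temperature settle, since Automatic tracks thermals
    print(f"{mode}: {tokens_per_second(model, tokenizer, prompt):.1f} toks/s")
```

Running each mode several times and averaging would smooth out the thermal variance the post describes for Automatic mode.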

Similar Articles

Localmaxxing (3 minute read)

TLDR AI

The article analyzes the viability of running AI inference locally on a MacBook Pro, comparing a local Qwen 35B model against the cloud-based Claude Opus 4.5. It concludes that the local model is roughly 2x faster for routine tasks, making it a practical choice for about half of daily workloads despite a slight capability gap.
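The 2x figure is the article's own claim, but a claim like this is easy to spot-check with a small timing harness such as the sketch below. The local model path, the Anthropic model identifier, and the 256-token budget are assumptions for illustration; the anthropic SDK reads ANTHROPIC_API_KEY from the environment.

```python
# Sketch: time the same prompt locally (mlx-lm) and against the Anthropic API.
# Model identifiers are illustrative assumptions, not taken from the article.
import time

import anthropic
from mlx_lm import load, generate

PROMPT = "Summarize the tradeoffs of local vs cloud LLM inference."

def time_local() -> float:
    # Load outside the timed region so download/compile time isn't counted.
    model, tokenizer = load("mlx-community/Qwen3-32B-4bit")  # illustrative path
    start = time.perf_counter()
    generate(model, tokenizer, prompt=PROMPT, max_tokens=256)
    return time.perf_counter() - start

def time_cloud() -> float:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    start = time.perf_counter()
    client.messages.create(
        model="claude-opus-4-5",  # assumed API identifier for Claude Opus 4.5
        max_tokens=256,
        messages=[{"role": "user", "content": PROMPT}],
    )
    return time.perf_counter() - start

print(f"local: {time_local():.1f}s  cloud: {time_cloud():.1f}s")
```

A fair comparison would also fix the output length and repeat the measurement, since cloud latency varies with server load.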