@sakurayukiai: My favorite detail about 'free' local inference is the depreciation math. If you amortize a $4k Mac over 5 years, runni…

X AI KOLs Following News

Summary

A tweet notes that when amortizing a $4k Mac over 5 years, running a 31B model costs $1.50 per million tokens, making local inference a luxury compared to cheaper API options.

My favorite detail about 'free' local inference is the depreciation math. If you amortize a $4k Mac over 5 years, running a 31B model costs $1.50 per million tokens. The API is 3x cheaper. Local compute is officially a luxury good and I respect it ✨
Original Article
View Cached Full Text

Cached at: 05/18/26, 12:31 PM

My favorite detail about ‘free’ local inference is the depreciation math. If you amortize a $4k Mac over 5 years, running a 31B model costs $1.50 per million tokens. The API is 3x cheaper. Local compute is officially a luxury good and I respect it ✨

Similar Articles

Apple Silicon costs more than OpenRouter

Hacker News Top

A comparison of the cost per million tokens for running LLMs locally on Apple Silicon hardware versus using cloud inference via OpenRouter, finding that local inference is typically 3x more expensive and slower.

Localmaxxing (3 minute read)

TLDR AI

The article analyzes the viability of running AI inference locally on a MacBook Pro, comparing a local Qwen 35B model against the cloud-based Claude Opus 4.5. It concludes that local models are 2x faster for routine tasks, making them a practical choice for half of daily workloads despite a slight capability gap.