@rohanpaul_ai: atomic[.]chat (a desktop app that runs LLMs locally) ran a very revealing comparison for local AI agents, on a MacBook …

X AI KOLs Following 05/30/26, 08:31 PM News

local-llms tool-calling benchmark liquid-model openai macbook-pro ai-agents

Summary

Liquid's LFM2.5-8B-A1B outperformed OpenAI's gpt-oss-20b on a tool-calling benchmark run locally on a MacBook Pro M5 Max (64GB). It completed every required tool call, cut runtime by more than half, and used less memory.

atomic[.]chat (a desktop app that runs LLMs locally) ran a very revealing comparison for local AI agents, on a MacBook Pro M5 Max, 64GB. Liquid’s much smaller LFM2.5-8B-A1B beat gpt-oss-20b by finishing every required tool call, cutting runtime by more than half, and using 4.8GB https://t.co/89GRmfJeJk

Cached at: 05/31/26, 06:40 AM

atomic.chat (@atomic_chat_hq): Liquid’s LFM2.5-8B-A1B smashed OpenAI’s gpt-oss-20b on tool calling

We ran both locally on a MacBook Pro M5 Max, 64GB, and gave each the same trip-planning request that only completes if the model fires all 7 tool calls - weather for 3 cities, two currency conversions, an email
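The pass/fail rule described above (a run counts only if every required tool call is made) can be sketched as a simple multiset check. This is a minimal illustration, not atomic.chat's actual harness: the tool names `get_weather`, `convert_currency`, and `send_email` are hypothetical, and because the quoted text is cut off after six of the seven calls, only those six are modeled here.

```python
from collections import Counter

# Hypothetical tool names; the source only describes the kinds of calls.
# The quoted text is truncated after six of the seven required calls,
# so the seventh is deliberately left out rather than guessed.
REQUIRED_CALLS = Counter({
    "get_weather": 3,       # weather for 3 cities
    "convert_currency": 2,  # two currency conversions
    "send_email": 1,        # an email
})


def missing_calls(observed):
    """Return a Counter of required calls that the model did not make."""
    # Counter subtraction keeps only positive counts, i.e. the shortfall.
    return REQUIRED_CALLS - Counter(observed)


def run_completed(observed):
    """A run passes only if no required call is missing."""
    return not missing_calls(observed)


full_run = ["get_weather"] * 3 + ["convert_currency"] * 2 + ["send_email"]
partial_run = ["get_weather"] * 3 + ["convert_currency"]

print(run_completed(full_run))
print(missing_calls(partial_run))
```

Counting calls by name rather than checking a fixed order keeps the check independent of the sequence in which a model chooses to call its tools.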

Similar Articles

@rohanpaul_ai: atomic[.]chat just made Gemma 4 26B faster inside LLaMA.cpp. making token generation about 40% faster in its MacBook Pr…

X AI KOLs Following

atomic.chat has optimized Gemma 4 26B inference in llama.cpp, achieving ~40% faster token generation on a MacBook Pro M5 Max using Multi-Token Prediction (MTP) speculative decoding. This is a notable win for local AI users running desktop apps, coding agents, and private on-device assistants.

@rohanpaul_ai: Another good news for local-LLM from atomic[.]chat, that runs 100% offline on your computer. They just showed MTP (Mult…

X AI KOLs Following

atomic.chat's MTP technique speeds up local LLM inference by drafting multiple tokens and verifying them together, achieving up to a 137% speedup on the Qwen 27B dense model with zero accuracy loss.

I've created the fastest local AI engine for Apple Silicon. Optimised for agentic use.

Reddit r/LocalLLaMA

The author announces the release of 'lightning-mlx', a local AI engine optimized for Apple Silicon that achieves high token speeds for coding agents and tool-calling workflows.

Localmaxxing (3 minute read)

TLDR AI

The article analyzes the viability of running AI inference locally on a MacBook Pro, comparing a local Qwen 35B model against the cloud-based Claude Opus 4.5. It concludes that local models are 2x faster for routine tasks, making them a practical choice for half of daily workloads despite a slight capability gap.

@DataChaz: MIND BLOWN that an open-source model running on a MacBook can go toe-to-toe with the cloud Spent yesterday in @atomic_c…

X AI KOLs Timeline

A tweet reports that an open-source model (Gemma 4 31B) running locally on a MacBook via AtomicChat can match cloud Gemini 3.5 Flash for generating a playable Mario game in HTML/Canvas, signaling a shrinking cloud moat.