@ivanfioravanti: Autoresearch from @karpathy in action locally using gemma-4-26b-a4b-it-6bit with oMLX on an M5 Max to train Gemma 4 E2B…

X AI KOLs Timeline 04/21/26, 09:18 PM Models

Summary

Developer Ivan Fioravanti demonstrates running Andrej Karpathy's autoresearch project locally with a 6-bit quantized Gemma-4-26B model on Apple Silicon, suggesting successful training of Gemma 4 E2B IT variant.

Autoresearch from @karpathy in action locally using gemma-4-26b-a4b-it-6bit with oMLX on an M5 Max to train Gemma 4 E2B IT COULD WORK!

Original Article

View Cached Full Text

Cached at: 04/22/26, 08:22 AM

Autoresearch from @karpathy in action locally using gemma-4-26b-a4b-it-6bit with oMLX on an M5 Max to train Gemma 4 E2B IT COULD WORK!

Similar Articles

@rohanpaul_ai: Gemma 4 (specifically its edge-optimized E2B and E4B variants) running fully offline on an iPhone via apps like Locally…

X AI KOLs Following

Google’s Gemma 4 E2B/E4B quantized variants now run fully offline on iPhone via apps like Locally AI, leveraging the Apple Neural Engine for on-device inference.

Gemma4 26b MoE running in MLX with turboquant (and custom kernel)

Reddit r/LocalLLaMA

A developer successfully ran Gemma4 26b MoE on Apple MacBook Air M5 using MLX with turboquant and a custom kernel, achieving faster prompt processing and generation speeds than llama.cpp with lower memory usage. The implementation includes instructions for local deployment.

@analogalok: I just got Gemma 4 26B A4B MoE model running fully locally with Hermes agent on an 8GB RTX 4060 and it's now backtestin…

X AI KOLs Following

A developer demonstrates running Gemma 4 26B MoE model locally on an 8GB RTX 4060 with Hermes agent to fully automate backtesting of trading strategies, highlighting the growing capability of local LLMs as autonomous agents.

@rohanpaul_ai: So much possibilities for on-device small models. Here @adrgrondin is running Google’s Gemma 4 E2B on iPhone 17 Pro. ~4…

X AI KOLs Following

Google's Gemma 4 E2B is demonstrated running on an iPhone 17 Pro via MLX optimization, achieving ~40 tokens/second with 128K context and offline thinking mode for coding and math.

@HuggingModels: Gemma 4 is here, and it's optimized for Apple Silicon. This 4-bit quantized model runs fast on your Mac, not just in th…

X AI KOLs Timeline

Gemma 4 is a 4-bit quantized model optimized for Apple Silicon, enabling fast local inference on Mac devices, reducing reliance on cloud computing.

Similar Articles

@rohanpaul_ai: Gemma 4 (specifically its edge-optimized E2B and E4B variants) running fully offline on an iPhone via apps like Locally…

Gemma4 26b MoE running in MLX with turboquant (and custom kernel)

@analogalok: I just got Gemma 4 26B A4B MoE model running fully locally with Hermes agent on an 8GB RTX 4060 and it's now backtestin…

@rohanpaul_ai: So much possibilities for on-device small models. Here @adrgrondin is running Google’s Gemma 4 E2B on iPhone 17 Pro. ~4…

@HuggingModels: Gemma 4 is here, and it's optimized for Apple Silicon. This 4-bit quantized model runs fast on your Mac, not just in th…

Submit Feedback