Gemma 4 audio with MLX
Summary
A practical guide to audio transcription on macOS using the Gemma 4 E2B model with MLX and mlx-vlm, including a uv run recipe and a demonstration of the workflow.
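The uv run recipe itself is not reproduced on this page. A rough sketch of what such an invocation might look like is below; note that the model repository name, the `--audio` flag, and the prompt are assumptions for illustration, not confirmed mlx-vlm CLI options. The command is assembled into a variable and printed rather than executed, since actually running it would download a multi-gigabyte model:

```shell
# Hypothetical uv run recipe for audio transcription with mlx-vlm.
# The model path and the --audio flag are assumptions, not verified options.
CMD="uv run --with mlx-vlm python -m mlx_vlm.generate \
  --model mlx-community/gemma-4-e2b-it-4bit \
  --prompt 'Transcribe this audio clip.' \
  --audio recording.wav \
  --max-tokens 512"

# Print the command instead of running it (execution would fetch the model).
echo "$CMD"
```

The `uv run --with mlx-vlm` pattern lets uv resolve and install the dependency into an ephemeral environment, so no prior `pip install` is needed.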
Similar Articles
New Gemma 4 MTP on MLX?
Google released Multi Token Prediction drafters for Gemma 4 to accelerate inference via speculative decoding, but support for MLX is currently unconfirmed or unavailable.
Jiunsong/supergemma4-26b-uncensored-mlx-4bit-v2
SuperGemma4-26B-Uncensored-MLX-4bit-v2 is a fine-tuned and quantized variant of Google's Gemma 4 26B optimized for Apple Silicon, offering improved performance on code, reasoning, and tool-use tasks while delivering faster inference than the stock baseline.
@ivanfioravanti: Autoresearch from @karpathy in action locally using gemma-4-26b-a4b-it-6bit with oMLX on an M5 Max to train Gemma 4 E2B…
Developer Ivan Fioravanti demonstrates running Andrej Karpathy's autoresearch project locally with a 6-bit quantized Gemma-4-26B model on Apple Silicon, suggesting that the Gemma 4 E2B IT variant can be trained successfully with this setup.
Trials and tribulations fine-tuning & deploying Gemma-4 [P]
An ML team documents practical challenges encountered while fine-tuning and deploying Gemma-4, including incompatibilities with PEFT, SFTTrainer, DeepSpeed ZeRO-3, and lack of runtime LoRA serving support, along with workarounds for each issue.
google/gemma-4-31B-it-assistant
Google DeepMind releases Gemma 4, a family of open-weights multimodal models featuring Multi-Token Prediction (MTP) for up to 2x decoding speedups, supporting text, image, video, and audio with enhanced reasoning and coding capabilities.