Google Gemma 4 12B
Summary
Google's Gemma 4 12B model enables local multimodal AI using an encoder-free architecture.
Similar Articles
@googleaidevs: We’re launching Gemma 4 12B: Our unified, encoder-free model that brings powerful multimodal intelligence straight to y…
Google launches Gemma 4 12B, an encoder-free multimodal model with native audio support, optimized for local execution on laptops and released under the Apache 2.0 license.
google/gemma-4-31B-it-assistant
Google DeepMind releases Gemma 4, a family of open-weight multimodal models that support text, image, video, and audio, with Multi-Token Prediction (MTP) for up to 2x faster decoding and enhanced reasoning and coding capabilities.
Gemma 2B multimodal model matches larger models without encoder
Google's Gemma 4 12B introduces an encoder-free multimodal architecture that competes with larger models, though benchmark comparisons show it trailing Qwen 2.5 9B on most tasks. The article also covers related developments including open-weight model security risks, Uber's Claude Code spending caps, and NeurIPS's misuse of an uncalibrated AI detector.
google/gemma-4-E4B-it-assistant
Google DeepMind releases the Gemma 4 E4B instruction-tuned assistant model, featuring multimodal capabilities, reasoning improvements, and optimized speculative decoding for low-latency on-device applications.
google/gemma-4-26B-A4B-it
Google DeepMind releases Gemma 4, a family of open-weight multimodal models ranging from 2.3B to 31B parameters with support for text, image, video, and audio inputs. The models feature 256K context windows, MoE and dense architectures, and enhanced reasoning capabilities, and are optimized for deployment across devices from mobile to servers.