Google's Gemma 4 12B introduces an encoder-free multimodal architecture that competes with larger models, though benchmark comparisons show it trailing Qwen 2.5 9B on most tasks. The article also covers related developments including open-weight model security risks, Uber's Claude Code spending caps, and NeurIPS's misuse of an uncalibrated AI detector.
Google's new Gemma 4 12B is a single decoder-only transformer with encoder-free multimodal input. It posts strong benchmark results while being small enough to run locally on a budget GPU, and it is released under the Apache 2.0 license.
A Google DeepMind researcher announces the release of Gemma 4 12B, a dense encoder-free model that processes text, image, and audio inputs, continuing work on unifying models across modalities.
Google's Gemma 4 12B model enables local multimodal AI using an encoder-free architecture.
Google launches Gemma 4 12B, an encoder-free multimodal model with native audio support, optimized for local execution on laptops and released under the Apache 2.0 license.