Google's Gemma 4 12B model, released last week, has already surpassed 4 million downloads on Hugging Face, making it the most popular encoder-free VLM and the first general-purpose LLM with encoder-free audio input. The model balances size and performance, enabling local use on a laptop for multi-step reasoning and agentic workflows.
A discussion comparing the Gemma 4 12B and 26a4b variants, focusing on creative tasks such as writing and chatting.
The author shares their experience switching from Qwen 3.6 to Gemma 4 12B (Unsloth Q5_K_XL) for local coding, praising its plug-and-play setup, better syntax accuracy, and manageable VRAM usage despite a slight speed trade-off.
The first finetuned variants of the Gemma 4 12B model are now available on Hugging Face, offered in GGUF format by multiple developers.
Gemma 4 12B models are now available on Ollama in several quantized versions for local AI inference.
Google released Gemma 4 12B, an open-source multimodal AI model under Apache 2.0 that runs locally on laptops with 16GB RAM, targeting enterprise edge deployment.
Google's Gemma 4 12B model enables local multimodal AI using an encoder-free architecture.
JetBrains introduces Mellum2, a 12B-parameter Mixture-of-Experts model optimized for code generation and reasoning tasks, with a focus on private deployment and integration into development workflows.