@googledevs: Autonomous AI in action. Check out how the new Gemma 4 31B model operates as an ADK Agent, exploring, planning, and run…

X AI KOLs Following Models

Summary

Google DeepMind released the Gemma 4 series of open-weight models, covering four sizes from 2B to 31B, supporting 128K–256K context, reasoning, and function calling, under Apache 2.0 license, and equipped with ADK framework for autonomous agent capabilities.

Autonomous AI in action. Check out how the new Gemma 4 31B model operates as an ADK Agent, exploring, planning, and running experiments on an unfamiliar database to optimize services and maximize revenue. Dive into the full, inspiring session here to explore the complete workflow: https://goo.gle/4ozP8sK
Original Article
View Cached Full Text

Cached at: 06/18/26, 08:11 PM

Autonomous AI in action.

Check out how the new Gemma 4 31B model operates as an ADK Agent, exploring, planning, and running experiments on an unfamiliar database to optimize services and maximize revenue.

Dive into the full, inspiring session here to explore the complete workflow: https://goo.gle/4ozP8sK


Gemma 4 Officially Launches: Open Weights, Multiple Scales, Agent-Ready – 31B Model Can Explore and Plan Autonomously as ADK Agent

TL;DR: Google DeepMind unveils the next generation of open-weight models, Gemma 4, covering four sizes from 2B to 31B, supporting 128K–256K context, reasoning, and function calling under the Apache 2.0 license. It runs on edge devices, phones, laptops, and the cloud, and pairs with the ADK framework for autonomous agents.

Gemma Family Philosophy & Gemma 4 Overview

Gemma is Google DeepMind’s open-weight large language model, launched in 2024. The core idea: create a model that runs anywhere, can be fully fine-tuned to user needs, while staying size‑sensible and efficient. Gemma 3 offered a family from 1B to 27B, while Gemma 4 is the most capable open model to date, available in four sizes:

  • 2 billion parameters – ideal for IoT devices
  • 4 billion parameters – for high‑performance phones or low‑end laptops
  • 26 billion parameters – mixture‑of‑experts (MoE) model, maximising efficiency and minimising latency
  • 31 billion parameters – dense model, easy to fine‑tune, targeting highest quality

All models prioritise efficiency, scoring comparably on the LLM Arena to models 20× larger. Downloads have surpassed 100 million, and AI Edge Gallery counts over 6 million.

Significant Capability Improvements: Larger Context, Reasoning, Function Calling & Open Apache 2.0

Compared to Gemma 3, key upgrades in Gemma 4:

  • Context window: small models grow from 32K tokens to 128K; 31B and 26B models rise to 256K.
  • Reasoning & function calling: all models simultaneously support thinking (reasoning) and function calling, ready for the agent era.
  • License: switched from the custom Gemma license to Apache 2.0, more flexible for production deployment.

Vision, Multimodal & Agent Capabilities

  • Vision: supports variable‑aspect‑ratio image understanding, better handling of charts, documents, and screenshots; built‑in multimodal translation; new object detection (bounding‑box annotation) suitable for IoT and robotics.
  • Agent: natively supports multi‑step planning, tool use, and autonomous task completion, integrating seamlessly into any agent pipeline.
  • Audio: audio understanding on the smallest models has been redesigned, supporting multiple languages, speech understanding, transcription, and translation.
  • Text: maintains leadership in English, with significantly improved internationalisation.

Multilingual & Benchmark Performance

  • EuroEval (independent European language benchmark): all Gemma models score excellent; the 31B model ranks 1st–5th in nearly every European language (covering both open and closed models).
  • Japanese comparison: very close to GPT‑5.4. Southeast Asian languages and Korean are also strong areas.
  • FoodTruck Bench (reasoning + function calling): the 31B model is called “a beast”, competing against DeepSeek v4 Pro (over 1 trillion parameters) and multiple top closed‑source models.
  • Efficiency: MTP Drafter enables speculative decoding, boosting decoding speed by up to 3×.

Deployment Options: Full Support from Local to Cloud

Gemma is fully open‑weight and compatible with the open‑source ecosystem. Olivier stressed: “You can fine‑tune it however you need.”

Local & Mobile

  • Edge devices: layer‑by‑layer embedding techniques optimise efficiency.
  • Laptops: choose the 31B dense model or the 26B MoE.
  • Android: Day 0 support for Gemma 4 via the Android API – run small models directly on‑phone, or even use the 26B model to write apps locally without API access.

Deployment on Google Cloud

Gus Martins introduced three tiers:

  1. Cloud Run (simplest)
    Deploy with two lines of code, auto‑scale to zero or hundreds of GPUs. Shuts down when not in use; restart takes only seconds of warm‑up.

  2. Gemini Enterprise Agent Platform (formerly Vertex) (medium control)

    • Model Garden: one‑click endpoint deployment, choose GPU (H100, RTX 6000, etc.) and model variant.
    • Model as a Service: no need to build your own endpoint – call the Gemma 26B API directly, pay per token.
    • Supports fine‑tuning (post‑training, reinforcement learning) using the same tools.
  3. Google Kubernetes Engine (GKE) (highest control)
    Full control over VMs, GPUs, TPUs, etc., with pre‑optimised configuration recipes that can be customised. MTP models are available – suitable for advanced users and agent workloads.

Live Demo: Gemma 31B as ADK Agent Autonomously Optimises Revenue

The demo shows Gemma 31B running on Cloud Run with the ADK (Agent Development Kit) framework, connected to a BigQuery MCP server (simulating a Citi Bike database). The query: “Help me optimise revenue.”

The model’s behaviour (real‑time, not sped up):

  1. First, it makes a plan: “Let me see what this database has.”
  2. It automatically queries the database structure (station locations, demand, usage times, etc.).
  3. Based on the data, it autonomously designs a strategy to optimise revenue.

Throughout the process, the model continuously explores, plans, and executes queries, demonstrating autonomous agent capability. Gus emphasised: “Everything you see is at real speed.”

Conclusion

With open weights, multiple scales, high efficiency, and multimodal capabilities, Gemma 4 provides flexible options for developers from edge to cloud. Combined with the Apache 2.0 license, native agent support, and the ADK framework, it becomes a powerful tool for building autonomous AI applications.

Source: YouTube – Autonomous AI in action. Check out how the new Gemma 4 31B model operates as an ADK Agent, exploring, planning, and run… (https://www.youtube.com/watch?v=oUtiZbrehrw)

Similar Articles

Gemma 4: Byte for byte, the most capable open models

Google DeepMind Blog

Google DeepMind introduces Gemma 4, its most capable family of open models to date, designed for advanced reasoning and agentic workflows with high intelligence-per-parameter efficiency across multiple sizes.

google/gemma-4-26B-A4B-it

Hugging Face Models Trending

Google DeepMind releases Gemma 4, a family of open-weight multimodal models ranging from 2.3B to 31B parameters with support for text, image, video, and audio inputs. The models feature 256K context windows, MoE and dense architectures, enhanced reasoning capabilities, and are optimized for deployment across devices from mobile to servers.

google/gemma-4-31B-it-assistant

Hugging Face Models Trending

Google DeepMind releases Gemma 4, a family of open-weights multimodal models featuring Multi-Token Prediction (MTP) for up to 2x decoding speedups, supporting text, image, video, and audio with enhanced reasoning and coding capabilities.

google/gemma-4-E4B-it-assistant

Hugging Face Models Trending

Google DeepMind releases the Gemma 4 E4B instruction-tuned assistant model, featuring multimodal capabilities, reasoning improvements, and optimized speculative decoding for low-latency on-device applications.