@_philschmid: Gemini Interactions API Update


Summary

Google is updating the Gemini Interactions API to replace strict user/model roles with a flexible step-based system (`outputs` + `roles` → `steps`), introducing agentic steps such as `user_input`, `thought`, `function_call`, `tool_call`, and `model_output`. The update also consolidates `response_format` controls and requires an SDK upgrade (Python/JS ≥2.0.0) or a new API header to opt in.

Gemini Interactions API Update

As we move beyond simple prompts, strict "user" and "model" roles felt limiting. That's why we're evolving the Gemini Interactions API to support rich, multi-domain agentic steps.

What's changing?

- `outputs` + `roles` → `steps`: every action (`user_input`, `thought`, `function_call`, `tool_call`, `model_output`, etc.) is its own step; no more `user`/`model` roles.
- A toggle on every Gemini API documentation page to switch between the Interactions API and `generateContent`.
- Consolidated `response_format` controls (aspect ratios, file formats, etc.).
- Updated Interactions API skill to make migration and updates seamless.
- Upgrade your SDKs (Python ≥2.0.0 / JS ≥2.0.0) or add the `Api-Revision: 2026-05-26` header to opt in.

We are in the final steps before GA! If you have feedback, spot a bug, or see a docs issue, let us know! We're listening and making changes. Full guide and agent skill below.
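The roles-to-steps change described above can be sketched as a request body. Everything below is an assumption for illustration: the post does not show the final schema, so the field names (`type`, `text`, `name`, `args`, `result`) and per-step payload shapes are hypothetical; only the step type names and the `Api-Revision: 2026-05-26` opt-in header come from the announcement.

```python
import json

# Opt-in header from the announcement; Content-Type is a standard assumption.
headers = {
    "Api-Revision": "2026-05-26",
    "Content-Type": "application/json",
}

# Instead of alternating "user"/"model" turns, each action is its own step.
# The step types are from the post; the per-step fields are made up here.
body = {
    "steps": [
        {"type": "user_input", "text": "What's the weather in Berlin?"},
        {"type": "thought", "text": "The user wants current weather; call the tool."},
        {"type": "function_call", "name": "get_weather", "args": {"city": "Berlin"}},
        {"type": "tool_call", "name": "get_weather", "result": {"temp_c": 18}},
        {"type": "model_output", "text": "It's 18 °C in Berlin right now."},
    ]
}

payload = json.dumps(body)  # what a client would send with the headers above
```

Note how a tool round-trip that previously had to be folded into alternating `user`/`model` messages becomes four explicit steps, so agent traces stay flat and inspectable.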

Similar Articles

Introducing the Gemini 2.5 Computer Use model

Google DeepMind Blog

Google releases Gemini 2.5 Computer Use model via the Gemini API, enabling developers to build AI agents that can interact with user interfaces through clicking, typing, and scrolling. The model outperforms alternatives on web and mobile control benchmarks with lower latency and is available in preview on Google AI Studio and Vertex AI.

Introducing Gemini 2.0: our new AI model for the agentic era

Google DeepMind Blog

Google DeepMind introduces Gemini 2.0, a new agentic AI model with native image and audio output, enhanced tool use, and multimodal capabilities designed for the next era of AI agents. Gemini 2.0 Flash is now available to developers with wider availability planned for early 2025.

Start building with Gemini 3

Google DeepMind Blog

Google has launched Gemini 3 Pro, a new AI model designed to outperform previous versions in coding, agentic workflows, and multimodal reasoning. The model is available via the Gemini API, Google AI Studio, and the new Google Antigravity development platform.

Improved Gemini audio models for powerful voice experiences

Google DeepMind Blog

Google has updated Gemini 2.5 Flash Native Audio to improve live voice agent capabilities, including sharper function calling, better instruction following, and smoother conversation context retrieval. The update also introduces live speech translation in the Google Translate app beta, preserving intonation across 70+ languages.