@GoogleDeepMind: Gemini 3.1 Flash TTS is our most controllable text-to-speech model yet. With new Audio Tags, you can easily direct voca…


Summary

Google DeepMind releases Gemini 3.1 Flash TTS, an advanced text-to-speech model featuring new Audio Tags that enable fine-grained control over vocal style, delivery, and pace through text commands.

Gemini 3.1 Flash TTS is our most controllable text-to-speech model yet. With new Audio Tags, you can easily direct vocal style, delivery, and pace through text commands.

Cached at: 04/20/26, 09:39 AM


Similar Articles

Gemini 3.1 Flash TTS

Simon Willison's Blog

Google released Gemini 3.1 Flash TTS, a new text-to-speech model accessible via the Gemini API that supports advanced prompt-based control for detailed voice direction, accents, and speaking styles. The model enables sophisticated audio generation including multi-speaker conversations and character-specific vocal performances.

Improved Gemini audio models for powerful voice experiences

Google DeepMind Blog

Google has updated Gemini 2.5 Flash Native Audio to improve live voice agent capabilities, including sharper function calling, better instruction following, and smoother conversation context retrieval. The update also introduces live speech translation in the Google Translate app beta, preserving intonation across 70+ languages.

Introducing Gemini 2.0: our new AI model for the agentic era

Google DeepMind Blog

Google DeepMind introduces Gemini 2.0, a new agentic AI model with native image and audio output, enhanced tool use, and multimodal capabilities designed for the next era of AI agents. Gemini 2.0 Flash is now available to developers, with wider availability planned for early 2025.

Gemini 2.5: Our most intelligent models are getting even better

Google DeepMind Blog

Google announces Gemini 2.5 series updates, including improved 2.5 Pro and Flash models with new capabilities like Deep Think (enhanced reasoning mode), native audio output, and computer use abilities via Project Mariner. The models now lead on WebDev Arena and LMArena leaderboards.