Introducing Veo 3.1 and advanced creative capabilities

Google DeepMind Blog Models

Summary

Google introduces Veo 3.1, an upgraded video generation model with richer audio, improved narrative control, and enhanced realism, alongside significant updates to Flow with new editing capabilities including Insert and Remove features, plus audio support across all existing tools.

We're rolling out significant updates to Veo that give people even more creative control.

Cached at: 04/20/26, 08:35 AM

# Introducing Veo 3.1 and advanced capabilities in Flow

Source: https://blog.google/innovation-and-ai/products/veo-updates-flow/

Thomas Iljic, Director of Product Management, Google Labs

## General summary

Flow now has enhanced creative tools and supports audio across all features. You can edit clips more precisely. Veo 3.1 brings richer audio, more narrative control and enhanced realism.

Summaries were generated by Google AI. Generative AI is experimental.

Five months ago, we introduced Flow (http://flow.google/), our AI filmmaking tool powered by Veo (https://deepmind.google/models/veo/), and have been inspired by the creativity it has sparked with over 275 million videos generated in Flow [1] (https://blog.google/innovation-and-ai/products/veo-updates-flow/#footnote-1). We're always listening to your feedback, and we've heard that you want more artistic control within Flow, with increased support for audio across all features.

Today, we're introducing new and enhanced creative capabilities to edit your clips, giving you more granular control over your final scene. For the first time, we're also bringing audio to existing capabilities like "Ingredients to Video," "Frames to Video" and "Extend."

We're also introducing Veo 3.1, which brings richer audio, more narrative control, and enhanced realism that captures true-to-life textures. Veo 3.1 is state-of-the-art (https://deepmind.google/models/veo/evals/) and builds on Veo 3, with stronger prompt adherence and improved audiovisual quality when turning images into videos.

## Refine your narrative with audio and more control

With Veo 3.1, we're bringing audio to existing capabilities to help you craft the perfect scene. These features are experimental and actively improving, and we're excited to see what you create as we iterate based on your feedback.

Now, with rich, generated audio, you can:

- **Craft the look of your scene.** With "Ingredients to Video," you can use multiple reference images to control the characters, objects and style. Flow uses your ingredients to create a final scene that looks just as you envisioned.
- **Control the shot from start to finish.** Provide a starting and ending image with "Frames to Video," and Flow will generate a seamless video that bridges the two, perfect for artful and epic transitions.
- **Create longer, seamless shots.** With "Extend," you can create longer videos, even lasting for a minute or more, that connect to and continue the action from your original clip. Each video is generated based on the final second of your previous clip, making it most useful for creating a longer establishing shot.

## Edit your ingredients and videos with more precision

Great ideas can strike at any point in the creative process. For moments when the first take isn't the final one, we're introducing new editing capabilities directly within Flow to help you reimagine and perfect your scenes.

- **Add new elements to any scene.** With "Insert," introduce anything you can imagine, from realistic details to fantastical creatures. Flow now handles complex details like shadows and scene lighting, making the addition look natural.
- **Remove unwanted objects or characters seamlessly.** Soon, you'll be able to take anything out of a scene, and Flow will reconstruct the background and surroundings, making it look as though the object was never there.

## Start creating in Flow today

With more precise editing capabilities, audio across all existing features and higher-quality outputs powered by Veo 3.1, we're opening up new possibilities for richer, more powerful video storytelling right inside Flow (http://flow.google/).

The Veo 3.1 model is also available via the Gemini API (https://ai.google.dev/gemini-api/docs/video?example=dialogue) for developers, Vertex AI (https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/veo-video-generation) for enterprise customers, and the Gemini app (http://gemini.google.com/veo). New capabilities are available in both the Gemini API (https://ai.google.dev/gemini-api/docs/video?example=dialogue) [2] (https://blog.google/innovation-and-ai/products/veo-updates-flow/#footnote-2) and Vertex AI (https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/veo-video-generation) [3] (https://blog.google/innovation-and-ai/products/veo-updates-flow/#footnote-3).
