Tag
Ethan He from xAI discusses why video agent models are the next frontier, arguing that video models derive intelligence from LLMs and that the evolution of video generation will mirror AI coding, shifting from one-shot output to multi-turn planning and execution.
Google announces Veo 3 and Imagen 4, next-generation video and image generation models with significant capability improvements including audio generation and enhanced physics simulation. The company also introduces Flow, an AI filmmaking tool, and expands access to Lyria 2 for music creation.
The article covers the generative media updates announced at the Google I/O 2026 Keynote, including the Google Pics image editing tool, the Stitch UI design tool, and new Google Flow features such as Gemini Omni, multi-agent parallel processing, custom tools, and music remixing. It emphasizes how these tools help users quickly turn their creative ideas into reality.