Ethan He from xAI discusses why video agent models are the next frontier, arguing that video models derive intelligence from LLMs and that the evolution of video generation will mirror AI coding, shifting from one-shot output to multi-turn planning and execution.
A Latent Space podcast episode discusses the thesis that video models derive intelligence from LLMs, and that the next frontier is video agents. Guest Ethan He, who built Grok Imagine at xAI, shares insights on building frontier image and video systems.
A tweet praises a podcast episode in which former xAI world model lead Ethan He gives detailed insights into training SOTA video generation world models, covering Grok Imagine, Cosmos, and the parallels between video and coding agents.
The article criticizes Arena.ai for allegedly running dishonest benchmarks, claiming it ranked GPT 5.5 below Meta's Muse Spark in coding and Grok Imagine above Seedance in video generation. The author asserts that both rankings are objectively false.
Ethan He, former head of world model at xAI and key contributor to Grok Imagine video generation, announced his departure. He built the multimodal video model from scratch in three months after joining xAI in July 2025.