@EthanHe_42: In @latentspacepod podcast, I shared my view on video generation, world models, LLMs, agents, continual learning and wh…


Summary

Ethan He shares his insights from a Latent Space podcast, discussing key ideas about video generation, world models, LLMs, agents, continual learning, and the next frontiers in AI.


Cached at: 06/02/26, 05:37 PM

In @latentspacepod podcast, I shared my view on video generation, world models, LLMs, agents, continual learning and where the next frontier is.

  1. Video models get most of their intelligence from language, not from video data.
  2. Idea-to-code is fast now. The bottleneck is back to having enough compute to try every idea.
  3. Iteration speed beats almost everything else in model development.
  4. The next leap won’t be a better video model. It’ll be a video agent.
  5. Diffusion will be the frontend of AGI, the LLM the backend. Generative UI will replace HTML/CSS: user intent straight to pixels.
  6. Physical embodiment may become a tool a powerful AI picks up. Robotics may get solved by video-capable LLMs.
  7. Continual learning may look like models that manage their own context, and even rewrite their own harness at test time.

Thanks @swyx and @vibhuuuus for having me.

Apple podcast: https://podcasts.apple.com/us/podcast/latent-space-the-ai-engineer-podcast/id1674008350?i=1000770600564…

Spotify: https://open.spotify.com/episode/1ZUjJ0WBqpp5F2vwZbpVSf…

Transcript on Substack: https://latent.space/p/video-agents

great interview @EthanHe_42 @latentspacepod

Great listen. Nice pod @EthanHe_42

Similar Articles

@swyx: full writeup and links here

X AI KOLs Timeline

A Latent Space podcast episode discusses the thesis that video models derive intelligence from LLMs, and that the next frontier is video agents. Guest Ethan He, who built Grok Imagine at xAI, shares insights on building frontier image and video systems.