@MaxForAI: Former NVIDIA and Meta researcher, xAI head of world model Ethan He just announced his departure. Many may not know exactly what he was responsible for. He was a core member of Grok Imagine (including video generation) from the 0-to-1 stage. He said when he joined xAI, Grok Imagine had nothing...
Summary
Ethan He, former head of world model at xAI and a key contributor to Grok Imagine video generation, announced his departure. After joining xAI in July 2025, he helped build its multimodal video model from scratch, and the team shipped it three months later.
Cached at: 05/21/26, 06:39 AM
Former NVIDIA and Meta researcher, xAI world model lead Ethan He has just announced his departure.
Many may not know exactly what he was responsible for.
He was a core member of Grok Imagine (including video generation) from 0 to 1.
He himself said that when he joined xAI, Grok Imagine had nothing: no data, no infrastructure, no model.
Three months later, they shipped a multimodal video model, followed by reference-to-video and video extension.
More broadly, his work has consistently centered on media generation, vision-language models (VLMs), and world models, rather than the typical product manager role for video generation.
He completed his undergraduate studies at Xi’an Jiaotong University, earning a Bachelor’s degree in Computer Science and Technology in 2018. He then pursued graduate studies at Carnegie Mellon University (CMU), entering the Master of Science in Computer Vision (MSCV) program at the Robotics Institute in 2018 and completing his degree in December 2019.
After that, Ethan He joined Meta AI (formerly Facebook AI Research, FAIR). His tenure at Meta AI primarily focused on multimodal learning systems and model optimization techniques for practical applications.
Ethan He joined NVIDIA in 2023 as a Staff Engineer, later serving as a Senior Deep Learning Algorithm Engineer, focusing on large-scale deep learning training frameworks, multimodal models, and mixture-of-experts architectures.
Part of He’s work at NVIDIA involved contributing to the development of the Cosmos world foundation model platform, aimed at accelerating the creation of customized world models for physical AI applications such as robotics and autonomous driving.
Ethan He joined xAI in July 2025, bringing his expertise from NVIDIA to contribute to the development of advanced AI models, particularly in video synthesis through the Grok Imagine project.
Grok Imagine v0.9 was released in early October 2025, improving visual quality, motion dynamics, local audio generation, and generation speed (less than 15 seconds per short video).
The release increased Grok’s user count several times over, though it also brought some controversy.
xAI has indeed seen a number of departures recently, and Ethan’s is one of the more notable among them.
Wishing him all the best in his next chapter.
Ethan He (@EthanHe_42): I’ve left xAI. It’s been quite a journey. I joined when xAI was about to build Grok Imagine from 0 to 1 - no data, no infra, no model. Three months later, we shipped our multimodal video model, followed by reference-to-video and video extension. I’m grateful for the opportunity
Similar Articles
Why Video Agent models are next — Ethan He, xAI Grok Imagine (98 minute read)
Ethan He from xAI discusses why video agent models are the next frontier, arguing that video models derive intelligence from LLMs and that the evolution of video generation will mirror AI coding, shifting from one-shot output to multi-turn planning and execution.
@_FORAB: So surreal. Recently, Meta founder Zuckerberg, on the eve of massive layoffs, had an internal public talk that was recorded by an employee. He revealed that all employees' devices are routinely tracked by the company to train AI models. The reason is to turn everyone's working methods into training data, so that fewer human workers are needed. This is too strang…
Meta founder Zuckerberg revealed in an internal talk that the company routinely tracks employee device data to train AI models, aiming to turn working methods into training data to reduce labor needs. The talk was recorded and exposed by an employee on the eve of layoffs.
@Andrew__Brown__: So, today is my last day at Meta... After I finished my PHD, I moved from Oxford to New York to join FAIR and work on g…
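Andrew Brown announced that he is leaving Meta to join Amilabs and looked back on the video generation research he took part in at Meta, including Emu Video and Movie Gen, emphasizing the importance of Transformers and scaling in video generation.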
@swyx: full writeup and links here
A Latent Space podcast episode discusses the thesis that video models derive intelligence from LLMs, and that the next frontier is video agents. Guest Ethan He, who built Grok Imagine at xAI, shares insights on building frontier image and video systems.
@swyx: This pod was an incredible gift to the community: not only our first pod about @xAI, but Ethan really indulged on all o…
A tweet praising a podcast episode where former xAI world model lead Ethan He provides deep insights into training SOTA video generation world models, covering Grok Imagine, Cosmos, and the parallels between video and coding agents.