@DengHokin: I am super excited to share that I launch a weekly Video Model Journal Club. Every week we pick one paper and go deep, …

Summary

The author launches a weekly Video Model Journal Club covering video generation, world models, physical reasoning, diffusion, flow matching, etc. The first in-person talk will be by Yilun Du on Embodied Reasoning with World Models.

I am super excited to share that I am launching a weekly Video Model Journal Club. Every week we pick one paper and go deep: video generation, world models, physical reasoning, diffusion, flow matching, and everything in between.

This Friday, Yilun Du @du_yilun from @Harvard will give a talk on Embodied Reasoning with World Models in person at @moonlake. I am really grateful to Fan-yun Sun @sunfanyun, Charlotte @xia_char and Shin @shinshin_oob for hosting.

Register to attend in person via Luma: https://luma.com/video-model

#video #AI #SF
Cached at: 06/16/26, 11:53 AM


Video Model Journal Club · Events Calendar

Source: https://luma.com/video-model

Every week we pick one paper and go deep: video generation, world models, physical reasoning, diffusion, flow matching, and everything in between.


Events

Embodied Reasoning with World Models by Yilun Du

By Hokin Deng, Fan-Yun Sun, Charlotte Xia, Shin & 2 others

San Francisco, United States

Think Visually, Reason Textually: Vision-Language Synergy in ARC by Beichen Zhang

Demystifying Video Reasoning by Ruisi Wang

Video Reasoning Models by Zhongang Cai

Video Models Can Reason with Verifiable Rewards by Tinghui Zhu

Video Models Are Zero-Shot Learners and Reasoners by Thaddäus Wiedemer

Do Joint Audio-Video Generation Models Understand Physics? by Zijun Cui
