The author is launching a weekly Video Model Journal Club covering video generation, world models, physical reasoning, diffusion, flow matching, and related topics. The first in-person talk will be given by Yilun Du on "Embodied Reasoning with World Models."
LongCat released WBench, a benchmark for video world models that tests control, memory, instruction-following, and physical plausibility across 289 cases and 20 models. It finds that no model excels in all dimensions, which highlights the gap between video quality and true world simulation.
A tweet thread explains that Google's Omni is distinct from video models such as Veo and Seedance. In the thread, DeepMind's Jeff Dean clarifies Omni's unique input/output capabilities, and the model is described as a transformative AR filter for video.