Tag
StressDream steers the diffusion-based imaginations of video world models toward high-impact yet plausible outcomes by optimizing the initial noise under semantic and plausibility objectives, which enables robust policy evaluation and improvement.
minWM is a full-stack, open-source framework that converts bidirectional video diffusion models into real-time interactive video world models with a controllable camera, low-latency rollout, and a modular architecture.
Incantation is an interactive video world model that uses natural language as its action interface, supporting fine-grained multi-entity control and cross-entity generalization; novel attention and distillation techniques give it high performance and real-time streaming.
MultiWorld is a unified framework for multi-agent, multi-view video world modeling that achieves accurate control of multiple agents while maintaining multi-view consistency through a Multi-Agent Condition Module and a Global State Encoder.