Holo-World: Unified Camera, Object and Weather Control for Video World Model

Hugging Face Daily Papers 06/18/26, 12:00 AM Papers

Summary

Holo-World presents a unified controllable video world model that generates videos from a single image with explicit control over camera, object motion, and weather. It introduces a novel dataset and techniques to preserve scene structure while transferring to target weather states.

Video world models are moving toward preserving an observed world under controllable camera and object motion while allowing its environmental state to change. Yet these controls remain isolated, and weather generation typically relies on a source video or reconstructed scene that already specifies future structure. We study a first-frame-anchored source-to-state setting, where the model starts from a single image and follows explicit camera and object controls and an optional weather instruction, then generates a video that either preserves the source world or transfers it to a target weather state. To address these challenges, we first build HoloStateData, a state video dataset that turns diverse videos into unified control samples for camera, object, and weather supervision. Second, we introduce Holo-World, a unified controllable video world model that jointly controls the scene from a single image. Its Unified Scene Adapter factorizes world preservation and weather transfer into distinct parameter subspaces, using rendered background, geometry buffers, and object controls to maintain controlled scene structure while modeling weather-dependent appearance and particle effects. Additionally, Scene-Weather Decomposed CFG guides scene and weather residuals separately, strengthening target weather effects without over-amplifying the full condition. Quantitative and qualitative experiments demonstrate that Holo-World maintains precise camera and object control with consistent scene structure while transferring scenes into diverse target weather states, outperforming video-to-video weather editing baselines on weather-state generation. Our project page is available at https://xiangchenyin.github.io/Holo-World/.
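To make the source-to-state setting concrete, the inputs described above (a single first frame, camera and object controls, and an optional weather instruction) can be pictured as one control sample. The sketch below is illustrative only: the class name, field names, the 6-value camera pose layout, and the 2D object tracks are assumptions made for this example and are not taken from the paper or from HoloStateData.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical container: every name and layout here is an assumption
# for illustration, not the paper's or the dataset's actual schema.
Pose = Tuple[float, float, float, float, float, float]  # assumed x, y, z, yaw, pitch, roll
Point = Tuple[float, float]                             # assumed normalized image coordinates


@dataclass
class ControlSample:
    first_frame: str                       # path to the single source image
    camera_poses: List[Pose]               # one camera pose per output frame
    object_tracks: List[List[Point]] = field(default_factory=list)  # one track per object
    weather: Optional[str] = None          # None means "preserve the source world"

    def mode(self) -> str:
        """Return which of the two behaviours the sample asks for."""
        return "preserve" if self.weather is None else "transfer"

    def validate(self) -> None:
        """Check that every object track covers every output frame."""
        n_frames = len(self.camera_poses)
        if n_frames == 0:
            raise ValueError("at least one camera pose is required")
        for i, track in enumerate(self.object_tracks):
            if len(track) != n_frames:
                raise ValueError(f"object track {i} has {len(track)} points, expected {n_frames}")


# Usage: same image and controls, with and without a weather instruction.
still = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
keep = ControlSample("frame0.png", [still] * 4)
snow = ControlSample("frame0.png", [still] * 4, [[(0.5, 0.5)] * 4], weather="snow")
print(keep.mode(), snow.mode())
```

Treating the weather instruction as optional mirrors the description above: with no instruction the sample asks for world preservation, and with one it asks for transfer to the target weather state.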

Original Article

Cached at: 06/20/26, 02:29 PM

Source: https://huggingface.co/papers/2606.20083

Abstract

A unified controllable video world model generates videos from a single image while preserving scene structure and transferring to target weather states through specialized parameterization and conditioning techniques.

Video world models are moving toward preserving an observed world under controllable camera and object motion while allowing its environmental state to change. Yet these controls remain isolated, and weather generation typically relies on a source video or reconstructed scene that already specifies future structure. We study a first-frame-anchored source-to-state setting, where the model starts from a single image and follows explicit camera and object controls and an optional weather instruction, then generates a video that either preserves the source world or transfers it to a target weather state. To address these challenges, we first build HoloStateData, a state video dataset that turns diverse videos into unified control samples for camera, object, and weather supervision. Second, we introduce Holo-World, a unified controllable video world model that jointly controls the scene from a single image. Its Unified Scene Adapter factorizes world preservation and weather transfer into distinct parameter subspaces, using rendered background, geometry buffers, and object controls to maintain controlled scene structure while modeling weather-dependent appearance and particle effects. Additionally, Scene-Weather Decomposed CFG guides scene and weather residuals separately, strengthening target weather effects without over-amplifying the full condition. Quantitative and qualitative experiments demonstrate that Holo-World maintains precise camera and object control with consistent scene structure while transferring scenes into diverse target weather states, outperforming video-to-video weather editing baselines on weather-state generation. Our project page is available at https://xiangchenyin.github.io/Holo-World/.
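The abstract says only that Scene-Weather Decomposed CFG guides scene and weather residuals separately; it does not give the formula. As a rough illustration of what separate guidance can look like, the sketch below follows a common multi-condition classifier-free guidance pattern, with one scale per residual. The function name, argument names, and the exact combination rule are assumptions for this example, not the paper's method.

```python
from typing import List, Sequence


def decomposed_cfg(
    eps_uncond: Sequence[float],   # model prediction with no conditioning
    eps_scene: Sequence[float],    # prediction with scene controls only
    eps_full: Sequence[float],     # prediction with scene controls and weather
    w_scene: float,                # guidance scale for the scene residual
    w_weather: float,              # guidance scale for the weather residual
) -> List[float]:
    """Combine three predictions with separate scene and weather scales.

    Assumed rule (illustrative, not from the paper):
        eps = eps_uncond
              + w_scene   * (eps_scene - eps_uncond)   # scene residual
              + w_weather * (eps_full  - eps_scene)    # weather residual
    """
    if not (len(eps_uncond) == len(eps_scene) == len(eps_full)):
        raise ValueError("all predictions must have the same length")
    return [
        u + w_scene * (s - u) + w_weather * (f - s)
        for u, s, f in zip(eps_uncond, eps_scene, eps_full)
    ]


def standard_cfg(eps_uncond: Sequence[float], eps_full: Sequence[float], w: float) -> List[float]:
    """Ordinary single-scale CFG on the full condition, for comparison."""
    return [u + w * (f - u) for u, f in zip(eps_uncond, eps_full)]


# Toy one-element "predictions" chosen so the arithmetic is exact.
u, s, f = [0.0], [1.0], [3.0]
same_scale = decomposed_cfg(u, s, f, w_scene=2.0, w_weather=2.0)
weather_boost = decomposed_cfg(u, s, f, w_scene=1.0, w_weather=3.0)
print(same_scale, standard_cfg(u, f, 2.0), weather_boost)
```

Under this assumed rule, equal scales reduce to ordinary CFG on the full condition, while raising only the weather scale amplifies the weather residual and leaves the scene residual at its own scale. That matches the stated goal of strengthening weather effects without over-amplifying the full condition.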

Similar Articles

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

WorldCraft: From Camera Navigation to Object Manipulation in Interactive Video World Models

MultiWorld: Scalable Multi-Agent Multi-View Video World Models

minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer
