Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion

Hugging Face Daily Papers 05/25/26, 12:00 AM Papers

Summary

Pantheon360 introduces a 3D-aware 360° video diffusion framework that uses an explicit 3D cache to enforce geometric consistency, enabling high-fidelity digital twin generation from sparse 360° inputs.

Generating complete digital twins from videos requires precise camera control, global scene coverage, and strict spatial-temporal consistency constraints that remain challenging for perspective video generators due to their limited field of view (FoV). Their narrow FoV forces long or multi-view trajectories, amplifying cross-view inconsistency and temporal drift. We argue that 360° video generation offers a natural solution: panoramic coverage simplifies trajectory design and provides a strong global context for maintaining coherence. We introduce Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion, a controllable 360° video generation framework that synthesizes high-fidelity videos from sparse 360° inputs. The key idea is an explicit 3D Cache, reconstructed from the input, which serves as a geometric scaffold for any user-defined camera path. This allows the diffusion model to focus on photorealistic texture refinement while the 3D Cache enforces global geometric consistency. Experiments show that Pantheon360 achieves superior visual quality and unmatched geometric coherence, enabling reliable and flexible 360° scene generation for downstream simulation and digital-twin applications.

Original Article

Cached at: 05/26/26, 02:41 AM

Source: https://huggingface.co/papers/2605.25449 Authors:

Abstract

Pantheon360 enables high-fidelity 360° video generation for digital twins by combining 3D-aware diffusion with explicit geometric caching to ensure spatial-temporal consistency.

Generating complete digital twins from videos requires precise camera control, global scene coverage, and strict spatial-temporal consistency constraints that remain challenging for perspective video generators due to their limited field of view (FoV). Their narrow FoV forces long or multi-view trajectories, amplifying cross-view inconsistency and temporal drift. We argue that 360° video generation offers a natural solution: panoramic coverage simplifies trajectory design and provides a strong global context for maintaining coherence. We introduce Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion, a controllable 360° video generation framework that synthesizes high-fidelity videos from sparse 360° inputs. The key idea is an explicit 3D Cache, reconstructed from the input, which serves as a geometric scaffold for any user-defined camera path. This allows the diffusion model to focus on photorealistic texture refinement while the 3D Cache enforces global geometric consistency. Experiments show that Pantheon360 achieves superior visual quality and unmatched geometric coherence, enabling reliable and flexible 360° scene generation for downstream simulation and digital-twin applications.
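To make the "geometric scaffold" idea concrete, here is a minimal sketch of how a cached 3D scene could be rendered into a 360° (equirectangular) frame for one pose on a user-defined camera path. The abstract does not say how Pantheon360 represents or renders its 3D Cache, so everything here is an assumption for illustration: the cache is taken to be a colored point cloud, the function name `render_equirect` is hypothetical, and the axis convention (+z forward, +x right, +y down) is chosen arbitrarily.

```python
import numpy as np

def render_equirect(points, colors, cam_pos, cam_rot, height=64, width=128):
    """Splat a colored point cloud into an equirectangular panorama.

    Illustrative only: not the paper's renderer.
    points:  (N, 3) world-space positions (the assumed "3D cache")
    colors:  (N, 3) RGB values in [0, 1]
    cam_pos: (3,) camera position in world space
    cam_rot: (3, 3) world-to-camera rotation matrix
    Returns (image, depth, mask); mask is True where the cache covers a pixel.
    """
    points = np.asarray(points, dtype=float)
    colors = np.asarray(colors, dtype=float)

    # Move points into the camera frame and compute their distance.
    p = (points - cam_pos) @ np.asarray(cam_rot, dtype=float).T
    r = np.linalg.norm(p, axis=1)
    keep = r > 1e-6  # drop points sitting on the camera center
    p, r, c = p[keep], r[keep], colors[keep]

    # Direction -> longitude/latitude (assumed convention: +z forward, +y down).
    lon = np.arctan2(p[:, 0], p[:, 2])                   # [-pi, pi]
    lat = np.arcsin(np.clip(p[:, 1] / r, -1.0, 1.0))     # [-pi/2, pi/2]

    # Longitude/latitude -> pixel column/row; longitude wraps around.
    u = ((lon / (2 * np.pi) + 0.5) * width).astype(int) % width
    v = np.clip(((lat / np.pi + 0.5) * height).astype(int), 0, height - 1)
    flat = v * width + u

    # Depth test: keep the nearest point that lands in each pixel.
    depth = np.full(height * width, np.inf)
    np.minimum.at(depth, flat, r)
    nearest = r == depth[flat]

    image = np.zeros((height * width, 3))
    image[flat[nearest]] = c[nearest]

    mask = np.isfinite(depth)
    return image.reshape(height, width, 3), depth.reshape(height, width), mask.reshape(height, width)
```

Under these assumptions, calling the function once per pose along a trajectory yields a sequence of geometry-consistent but incomplete panoramas. Pixels where `mask` is False mark regions the cache does not cover; in a pipeline of the kind the abstract describes, such frames would condition a generative model that fills in and refines appearance.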

Similar Articles

AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model

TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking

Helix4D: Complex 4D Mesh Generation

Unified Panoramic Geometry Estimation via Multi-View Foundation Models

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer
