GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction

Hugging Face Daily Papers 05/22/26, 12:00 AM Papers

Summary

GenRecon introduces a method for 3D scene reconstruction that integrates generative 3D priors with multi-view image conditioning, achieving high-fidelity, editable mesh reconstructions of indoor environments and outperforming existing methods by 16%.

We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior. We cast scene reconstruction as conditional 3D generation over a set of spatially-localized, overlapping chunks that together tile the scene, scaling generation to large scene extents. Crucially, we inherit the fidelity and completeness of state-of-the-art generative shape models -- we use Trellis.2 as an example -- which we generalize to the scene level. To this end, we propose a projection-based conditioning mechanism that lifts posed multi-view image features into a coherent 3D representation aligned with the generative model, independent of view ordering and spatially anchored to the scene, yielding high-fidelity, multi-view consistent generated geometry. This enables lifting the strong object-level prior of Trellis.2 to multi-view, scene-scale generation, producing faithful, editable PBR mesh reconstructions of indoor environments. As a result, we obtain high-fidelity results that outperform cutting-edge reconstruction methods by 16%.
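The abstract does not say how the overlapping chunks are laid out, so the following is only a minimal sketch of one plausible way to tile a scene bounding box with overlapping, axis-aligned cubic chunks. The function names (`chunk_origins`, `tile_scene`), the cubic chunk shape, and the `size` and `overlap` values are illustrative assumptions, not details taken from the paper.

```python
import math
from itertools import product


def chunk_origins(lo, hi, size, overlap):
    """Start coordinates of overlapping 1D intervals of length `size` covering [lo, hi].

    Hypothetical helper: the uniform stride is an assumption of this sketch.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    stride = size - overlap
    extent = hi - lo
    # Smallest count such that the last interval reaches `hi`.
    count = max(1, math.ceil((extent - size) / stride) + 1)
    return [lo + i * stride for i in range(count)]


def tile_scene(bounds_min, bounds_max, size, overlap):
    """Return (min_corner, max_corner) pairs of cubic chunks covering the scene box."""
    axes = [
        chunk_origins(lo, hi, size, overlap)
        for lo, hi in zip(bounds_min, bounds_max)
    ]
    return [
        (corner, tuple(c + size for c in corner))
        for corner in product(*axes)
    ]


# Example: a 6 m x 4 m x 3 m room, 2 m chunks, 0.5 m overlap (made-up numbers).
chunks = tile_scene((0.0, 0.0, 0.0), (6.0, 4.0, 3.0), size=2.0, overlap=0.5)
```

In a setup like this, each chunk would be generated conditionally on the images that observe it, and the overlap gives neighbouring chunks shared context at their seams. How the paper actually merges overlapping regions is not stated in the abstract.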

Original Article

Cached at: 05/25/26, 02:35 AM

Source: https://huggingface.co/papers/2605.23888

Abstract

A novel method for 3D scene reconstruction that integrates generative 3D priors with multi-view image conditioning to produce high-fidelity, editable mesh reconstructions of indoor environments.

We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior. We cast scene reconstruction as conditional 3D generation over a set of spatially-localized, overlapping chunks that together tile the scene, scaling generation to large scene extents. Crucially, we inherit the fidelity and completeness of state-of-the-art generative shape models -- we use Trellis.2 as an example -- which we generalize to the scene level. To this end, we propose a projection-based conditioning mechanism that lifts posed multi-view image features into a coherent 3D representation aligned with the generative model, independent of view ordering and spatially anchored to the scene, yielding high-fidelity, multi-view consistent generated geometry. This enables lifting the strong object-level prior of Trellis.2 to multi-view, scene-scale generation, producing faithful, editable PBR mesh reconstructions of indoor environments. As a result, we obtain high-fidelity results that outperform cutting-edge reconstruction methods by 16%.
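To make the idea of projection-based conditioning concrete, here is a minimal sketch of lifting posed image features to a 3D point: the point is projected into every view with a pinhole camera model, a feature is sampled where it lands, and the samples are averaged. Everything here is an assumption for illustration: the names `project` and `lift_features`, the dictionary layout of a view, nearest-neighbour sampling, and mean aggregation are not taken from the paper, whose actual mechanism is not detailed in the abstract. Averaging is used only because it is a simple aggregation that does not depend on view order.

```python
def project(point, world_to_cam, fx, fy, cx, cy):
    """Pinhole projection of a world-space point.

    `world_to_cam` is a 3x4 row-major [R | t] matrix (assumed convention).
    Returns (u, v) pixel coordinates, or None if the point is behind the camera.
    """
    x, y, z = point
    xc, yc, zc = (r[0] * x + r[1] * y + r[2] * z + r[3] for r in world_to_cam)
    if zc <= 1e-6:
        return None
    return fx * xc / zc + cx, fy * yc / zc + cy


def lift_features(point, views):
    """Average the features sampled at the projections of `point` into each view."""
    total, hits = None, 0
    for view in views:
        uv = project(point, view["world_to_cam"], *view["intrinsics"])
        if uv is None:
            continue
        col, row = int(round(uv[0])), int(round(uv[1]))  # nearest-neighbour lookup
        fmap = view["features"]  # nested lists indexed as [row][col][channel]
        if not (0 <= row < len(fmap) and 0 <= col < len(fmap[0])):
            continue  # point projects outside this image
        feat = fmap[row][col]
        total = list(feat) if total is None else [a + b for a, b in zip(total, feat)]
        hits += 1
    return None if hits == 0 else [v / hits for v in total]


# Two toy 3x3 feature maps with 2 channels each (made-up data).
view_a = {
    "world_to_cam": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    "intrinsics": (1.0, 1.0, 1.0, 1.0),  # fx, fy, cx, cy
    "features": [[[float(r), float(c)] for c in range(3)] for r in range(3)],
}
view_b = {
    "world_to_cam": [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0]],
    "intrinsics": (1.0, 1.0, 1.0, 1.0),
    "features": [[[10.0 * r, 10.0 * c] for c in range(3)] for r in range(3)],
}

lifted = lift_features((0.0, 0.0, 2.0), [view_a, view_b])
lifted_reversed = lift_features((0.0, 0.0, 2.0), [view_b, view_a])
```

Because the result is tied to the 3D point rather than to any image index, swapping the view order leaves it unchanged, which mirrors the "independent of view ordering and spatially anchored to the scene" property the abstract describes.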

Similar Articles

AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model

Geometry-Aware Representation Denoising for Robust Multi-view 3D Reconstruction

Unified Panoramic Geometry Estimation via Multi-View Foundation Models

Sat3DGen: Comprehensive Street-Level 3D Scene Generation from Single Satellite Image

VidSplat: Gaussian Splatting Reconstruction with Geometry-Guided Video Diffusion Priors
