World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible

Hugging Face Daily Papers 06/11/26, 12:00 AM Papers

3d-geometry pixel-aligned diffusion-transformer image-to-3d occluded-surfaces flow-matching 3d-reconstruction

Summary

World Tracing introduces a generative pixel-aligned geometry representation that predicts 3D points aligned with observed pixels while completing occluded surfaces. It uses a diffusion transformer trained with pixel-space flow matching, achieving strong performance on visible-surface reconstruction and complete geometry generation across object, scene, and dynamic benchmarks.

Image-to-3D methods often trade off faithfulness and completeness: depth estimators are anchored to input pixels but stop at the visible surface, while image-to-3D models generate complete shapes that are often misaligned with the input. We introduce World Tracing, a generative pixel-aligned geometry representation that predicts 3D points aligned with observed pixels while completing geometry beyond the visible surface. For each input pixel, World Tracing predicts an ordered stack of camera-space 3D points, where the first layer represents the visible surface and subsequent layers represent front-to-back intersections with occluded surfaces. We instantiate this representation with a world-tracing diffusion transformer, WT-DiT, which treats multiple geometry layers as separate denoising tokens coupled through factorized and global attention. WT-DiT is trained with pixel-space flow matching and a mixed noise schedule that balances visible-surface reconstruction with occluded-geometry generation. World Tracing achieves strong performance on visible-surface reconstruction and complete geometry generation across object, scene, and dynamic benchmarks, outperforming both depth predictors and image-to-3D generators. It also preserves 2D-to-3D correspondence, enabling text-driven 3D scene editing, geometry-conditioned novel-view video synthesis, and training-free integration with textured-mesh generators.
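The layered, pixel-aligned representation described above can be illustrated with a toy example. The sketch below is not the paper's model: it analytically traces a single sphere with a pinhole camera to show what an "ordered stack of camera-space 3D points" per pixel could look like. The function names, the camera intrinsics, the two-layer limit, and the use of an empty list for rays that hit nothing are all assumptions made for illustration.

```python
import math

def trace_sphere_layers(center, radius, width=5, height=5, focal=4.0, max_layers=2):
    """Return stack[v][u]: camera-space hits along the pixel's ray, front to back."""
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0  # principal point (assumed)
    stack = []
    for v in range(height):
        row = []
        for u in range(width):
            # Pinhole ray through pixel (u, v); camera at the origin, looking down +z.
            d = ((u - cx) / focal, (v - cy) / focal, 1.0)
            # Solve |t*d - center|^2 = radius^2 for the ray parameter t.
            a = sum(di * di for di in d)
            b = -2.0 * sum(di * ci for di, ci in zip(d, center))
            c = sum(ci * ci for ci in center) - radius * radius
            disc = b * b - 4.0 * a * c
            hits = []
            if disc > 0.0:  # two distinct intersections; grazing rays are skipped
                root = math.sqrt(disc)
                # The smaller root comes first, so hits are ordered front to back.
                for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
                    if t > 0.0:  # keep only intersections in front of the camera
                        hits.append(tuple(t * di for di in d))
            row.append(hits[:max_layers])
        stack.append(row)
    return stack

def project(point, width=5, height=5, focal=4.0):
    """Project a camera-space point back to pixel coordinates."""
    x, y, z = point
    return (focal * x / z + (width - 1) / 2.0, focal * y / z + (height - 1) / 2.0)

stack = trace_sphere_layers(center=(0.0, 0.0, 3.0), radius=1.0)
visible, hidden = stack[2][2]  # centre pixel: front and back of the sphere
```

Every point in a pixel's stack lies on that pixel's ray, so `project` maps each layer back to the same pixel. That is the 2D-to-3D correspondence the abstract refers to, here obtained by construction rather than by a learned model.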

Original Article


Cached at: 06/15/26, 04:59 PM

Paper page - World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible

Source: https://huggingface.co/papers/2606.13652

Abstract

World Tracing introduces a generative pixel-aligned geometry representation that predicts 3D points aligned with input pixels while completing hidden surfaces, using a diffusion transformer trained with pixel-space flow matching.

Image-to-3D methods often trade off faithfulness and completeness: depth estimators are anchored to input pixels but stop at the visible surface, while image-to-3D models generate complete shapes that are often misaligned with the input. We introduce World Tracing, a generative pixel-aligned geometry representation that predicts 3D points aligned with observed pixels while completing geometry beyond the visible surface. For each input pixel, World Tracing predicts an ordered stack of camera-space 3D points, where the first layer represents the visible surface and subsequent layers represent front-to-back intersections with occluded surfaces. We instantiate this representation with a world-tracing diffusion transformer, WT-DiT, which treats multiple geometry layers as separate denoising tokens coupled through factorized and global attention. WT-DiT is trained with pixel-space flow matching and a mixed noise schedule that balances visible-surface reconstruction with occluded-geometry generation. World Tracing achieves strong performance on visible-surface reconstruction and complete geometry generation across object, scene, and dynamic benchmarks, outperforming both depth predictors and image-to-3D generators. It also preserves 2D-to-3D correspondence, enabling text-driven 3D scene editing, geometry-conditioned novel-view video synthesis, and training-free integration with textured-mesh generators.
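The abstract names flow matching as the training objective but does not give its exact form. As a rough illustration only, the sketch below uses the common straight-line interpolation between Gaussian noise and data, with the velocity as the regression target. This choice is an assumption, not a confirmed detail of WT-DiT. The paper's mixed noise schedule and network are not reproduced; `flow_matching_pair`, `mse`, and `euler_sample` are hypothetical helper names, and the sample is a flat list standing in for per-pixel coordinate values.

```python
import random

def flow_matching_pair(x1, t, rng):
    """Build one training pair on the straight path from noise x0 to data x1."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]                   # Gaussian noise sample
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]   # point on the path at time t
    v_target = [b - a for a, b in zip(x0, x1)]              # constant velocity along the path
    return x0, x_t, v_target

def mse(pred, target):
    """Mean squared error used as the velocity regression loss."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)

def euler_sample(velocity_fn, x0, steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

rng = random.Random(0)
x1 = [2.0, 0.5, -1.0]                        # stand-in for per-pixel 3D coordinates
x0, x_t, v_target = flow_matching_pair(x1, t=0.25, rng=rng)
loss = mse([0.0, 0.0, 0.0], v_target)        # loss of a (bad) all-zero velocity prediction
sample = euler_sample(lambda x, t: v_target, x0)  # oracle velocity recovers x1
```

In training, a network would replace the all-zero prediction and receive `x_t`, `t`, and the input image as conditioning. At inference, `euler_sample` would be driven by that network instead of the oracle velocity used here.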


Get this paper in your agent:

hf papers read 2606.13652

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper: 4

#### haoz19/object-model-6layer
Image-to-3D • Updated 4 days ago • 8

#### haoz19/scene-model-6layer
Image-to-3D • Updated 4 days ago • 5

#### haoz19/scene-model-6layer-840
Image-to-3D • Updated 1 day ago • 4

#### haoz19/dynamic-model-16frame
Image-to-3D • Updated 4 days ago • 3

Datasets citing this paper: 0

No dataset linking this paper

Cite arxiv.org/abs/2606.13652 in a dataset README.md to link it from this page.

Spaces citing this paper: 1

Collections including this paper: 1

Similar Articles

Pixal3D: Pixel-Aligned 3D Generation from Images

SurGe: Improved Surface Geometry in Point Maps

World Machine: Towards Generative World Modeling for Time-Series

Geometry-Aware Image Flow Matching

TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction
