An individual has used Claude Code to build a tool that generates fully interactive 3D property tours from existing image captures, aiming to replace expensive hardware and subscription services such as Matterport.
This paper introduces TT4D, a novel pipeline and large-scale dataset for reconstructing table tennis gameplay in 4D from monocular videos. It features a unique lift-first approach that estimates 3D ball trajectories and spin before time segmentation, enabling robust reconstruction even with occlusions.
AnyRecon proposes a scalable framework for 3D reconstruction from arbitrary sparse inputs using a video diffusion model with persistent scene memory and geometry-aware conditioning.
LingBot-Map is an AI model that converts live video streams into 3D reconstructions in real time, running at 20 FPS, with complete code and model provided.
LingBot-Map is a feed-forward 3D foundation model for streaming 3D reconstruction that uses a Geometric Context Transformer architecture, achieving state-of-the-art performance with efficient ~20 FPS inference on long sequences exceeding 10,000 frames.
GlobalSplat introduces an efficient feed-forward framework for 3D Gaussian splatting that achieves compact and consistent scene reconstruction using global scene tokens, reducing computational overhead and bringing inference time under 78 ms. The method uses a coarse-to-fine training approach to prevent representation bloat while maintaining competitive novel-view synthesis performance with significantly fewer Gaussians (16K) than dense baselines.
This work introduces LingBot-Map, a feed-forward 3D foundation model for streaming 3D reconstruction that uses a Geometric Context Transformer architecture to achieve stable real-time performance at 20 FPS.
HY-World 2.0 is Tencent's open-source multi-modal 3D world model that reconstructs and generates 3D worlds from text, images, and videos, producing editable 3D assets (meshes and Gaussian splats) of quality comparable to the output of closed-source methods.
MIT researchers have developed a generative AI-enhanced wireless vision system that reconstructs hidden objects and entire room scenes using millimeter-wave signals, overcoming previous limitations in shape reconstruction and enabling applications in warehouse robotics and smart homes.