Adaptive Volumetric Mechanical Property Fields Invariant to Resolution
Summary
AdaVoMP uses a sparse adaptive voxel structure and transformer encoder-decoder to predict spatially-varying mechanical properties for 3D objects, enabling high-resolution deformable simulations with improved accuracy and efficiency.
Cached at: 06/20/26, 02:29 PM
Source: https://huggingface.co/papers/2606.18231
Abstract
AdaVoMP predicts dense spatially-varying mechanical properties for 3D objects using a sparse adaptive voxel structure and transformer encoder-decoder model, enabling realistic deformable simulations with improved accuracy and efficiency.
Accurate mechanical properties (or materials), namely Young’s modulus (E), Poisson’s ratio (ν), and density (ρ), are essential for reliable physics simulation of digital worlds, but most 3D assets lack this information. We propose AdaVoMP, a method for predicting accurate dense spatially-varying (E, ν, ρ) for input 3D objects across representations, improving the resolution, accuracy, and memory efficiency over the state-of-the-art. The foundation of our technique is a sparse and adaptive voxel structure (SAV) that efficiently represents both the input 3D shape and the material field output. We replace the fixed-voxel model of the most accurate prior method, VoMP, with a novel sparse transformer encoder-decoder model that learns to generate a unique SAV autoregressively for every input shape to represent its materials, achieving a resolution 16^3 times higher than prior art. Experiments show that AdaVoMP estimates more accurate volumetric properties, even with less test-time compute than all prior art. This allows us to convert high-resolution complex 3D objects into simulation-ready assets, resulting in realistic deformable simulations.
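To make the idea of a sparse, adaptive voxel field more concrete, here is a minimal Python sketch. It is not the authors' SAV and makes no claim about how AdaVoMP stores or generates its structure. The class `SparseAdaptiveVoxels`, the `Material` record, the `(level, i, j, k)` keying scheme, and the material values are all hypothetical, chosen only to illustrate how leaves of different sizes can hold (E, ν, ρ) while empty space costs nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Material:
    """Illustrative material record; field names are assumptions."""
    youngs_modulus: float  # E, in pascals
    poisson_ratio: float   # nu, dimensionless
    density: float         # rho, in kg/m^3


class SparseAdaptiveVoxels:
    """Hypothetical sparse adaptive grid over the unit cube [0, 1)^3.

    A leaf is keyed by (level, i, j, k); a cell at level L has side 2**-L.
    Only occupied leaves are stored, so empty regions use no memory.
    """

    def __init__(self, max_level):
        self.max_level = max_level
        self.leaves = {}

    def insert(self, level, index, material):
        if not 0 <= level <= self.max_level:
            raise ValueError("level out of range")
        cells_per_axis = 1 << level
        if any(not 0 <= c < cells_per_axis for c in index):
            raise ValueError("index out of range for this level")
        self.leaves[(level, *index)] = material

    def query(self, point):
        """Return the material of the finest leaf containing point, or None."""
        if any(not 0.0 <= c < 1.0 for c in point):
            return None
        # Search from the finest level to the coarsest.
        for level in range(self.max_level, -1, -1):
            cells_per_axis = 1 << level
            key = (level, *(int(c * cells_per_axis) for c in point))
            if key in self.leaves:
                return self.leaves[key]
        return None


# Illustrative values only (roughly rubber-like and steel-like).
soft = Material(youngs_modulus=1e7, poisson_ratio=0.49, density=1100.0)
stiff = Material(youngs_modulus=2e11, poisson_ratio=0.30, density=7850.0)

grid = SparseAdaptiveVoxels(max_level=4)
grid.insert(1, (0, 0, 0), soft)      # one coarse cell of side 1/2
grid.insert(4, (15, 15, 15), stiff)  # one fine cell of side 1/16

dense_cells = (1 << grid.max_level) ** 3  # cells in a dense 16^3 grid
stored_leaves = len(grid.leaves)
```

In this toy setup a dense grid at the finest level would need 16^3 = 4096 cells, while the sparse structure stores two leaves. A query at (0.1, 0.1, 0.1) falls through the empty fine levels and returns the coarse soft leaf, a query at (0.99, 0.99, 0.99) returns the fine stiff leaf, and a query in unoccupied space returns `None`.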
Similar Articles
SpatialAvatar-0: High-Quality 4D Head Avatar with Multi-Stage Reconstruction
SpatialAvatar-0 introduces a multi-stage reconstruction method for high-quality 4D head avatars using a shared FLAME-mesh-bound Gaussian representation, achieving superior performance across benchmarks with reduced iterations.
Stream3D-VLM: Online 3D Spatial Understanding with Incremental Geometry Priors
Stream3D-VLM is an online 3D vision-language model that enables real-time spatial understanding from streaming video by incrementally integrating geometry priors and using geometry-adaptive voxel compression, outperforming existing models on 3D spatial understanding tasks.
MeshWeaver: Sparse-Voxel-Guided Surface Weaving for Autoregressive Mesh Generation
MeshWeaver presents an autoregressive mesh generation framework that directly predicts vertices using a multi-level sparse-voxel encoder, achieving state-of-the-art compression and geometric fidelity for high-poly meshes.
Patch-PODiff-ViT: Structured Latent Diffusion with Patchwise POD for Super-Resolution and Uncertainty Quantification
Patch-PODiff-ViT introduces a structured latent diffusion framework using patchwise Proper Orthogonal Decomposition (POD) for super-resolution and uncertainty quantification, enabling efficient diffusion with a fixed linear orthonormal basis and analytic propagation of predictive variance.
AdaCodec: A Predictive Visual Code for Video MLLMs
AdaCodec reduces video encoding redundancy in multimodal LLMs by transmitting full visual tokens only when scene prediction fails, otherwise using compact inter-frame change descriptions. It outperforms per-frame RGB baselines at matched token budgets and achieves better or comparable results with significantly fewer tokens, reducing time-to-first-token from 9.26s to 1.62s.