SAM 3D Body: Robust Full-Body Human Mesh Recovery

Papers with Code Trending Papers

Summary

SAM 3D Body is a promptable 3D human mesh recovery model using a novel parametric representation (MHR) and encoder-decoder architecture, achieving state-of-the-art performance with strong generalization. The model supports auxiliary prompts and is open-source.

We introduce SAM 3D Body (3DB), a promptable model for single-image full-body 3D human mesh recovery (HMR) that demonstrates state-of-the-art performance, with strong generalization and consistent accuracy in diverse in-the-wild conditions. 3DB estimates the human pose of the body, feet, and hands. It is the first model to use a new parametric mesh representation, Momentum Human Rig (MHR), which decouples skeletal structure and surface shape. 3DB employs an encoder-decoder architecture and supports auxiliary prompts, including 2D keypoints and masks, enabling user-guided inference similar to the SAM family of models. We derive high-quality annotations from a multi-stage annotation pipeline that uses various combinations of manual keypoint annotation, differentiable optimization, multi-view geometry, and dense keypoint detection. Our data engine efficiently selects and processes data to ensure data diversity, collecting unusual poses and rare imaging conditions. We present a new evaluation dataset organized by pose and appearance categories, enabling nuanced analysis of model behavior. Our experiments demonstrate superior generalization and substantial improvements over prior methods in both qualitative user preference studies and traditional quantitative analysis. Both 3DB and MHR are open-source.
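To make the idea of decoupling skeletal structure from surface shape concrete, here is a minimal toy sketch in Python. It is not the MHR parameterization, which the abstract does not specify. The `ToyRig` class, its 2D kinematic chain, and the per-bone `surface_radii` are all hypothetical names invented for illustration. The only point it demonstrates is that joint positions are computed from skeleton and pose parameters alone, so changing the surface parameters leaves the skeleton untouched.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class ToyRig:
    """Hypothetical 2D kinematic chain; NOT the actual MHR parameterization."""
    bone_lengths: List[float]   # skeletal structure
    joint_angles: List[float]   # pose, in radians, relative to the parent bone
    surface_radii: List[float]  # surface shape: limb thickness per bone

    def joints(self) -> List[Point]:
        """Forward kinematics; uses only skeleton and pose parameters."""
        x, y, heading = 0.0, 0.0, 0.0
        out = [(x, y)]
        for length, angle in zip(self.bone_lengths, self.joint_angles):
            heading += angle
            x += length * math.cos(heading)
            y += length * math.sin(heading)
            out.append((x, y))
        return out

    def surface(self) -> List[Point]:
        """One surface sample per bone: the bone midpoint pushed out along
        the bone normal by that bone's radius (assumes non-zero bone lengths)."""
        js = self.joints()
        pts = []
        for (x0, y0), (x1, y1), r in zip(js, js[1:], self.surface_radii):
            dx, dy = x1 - x0, y1 - y0
            norm = math.hypot(dx, dy)
            nx, ny = -dy / norm, dx / norm
            pts.append(((x0 + x1) / 2 + r * nx, (y0 + y1) / 2 + r * ny))
        return pts


# Two rigs with the same skeleton and pose but different surface shape.
thin = ToyRig([1.0, 0.8], [0.0, math.pi / 2], [0.05, 0.04])
wide = ToyRig([1.0, 0.8], [0.0, math.pi / 2], [0.15, 0.12])
assert thin.joints() == wide.joints()    # skeleton unaffected by shape
assert thin.surface() != wide.surface()  # surface does change
```

In a representation where joints are instead derived from the shaped surface, editing shape would also move the joints. The toy above keeps the two parameter sets independent, which is the property the abstract attributes to MHR.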

Cached at: 06/05/26, 02:06 PM


Source: https://huggingface.co/papers/2602.15989

Abstract

A promptable 3D human mesh recovery model using a novel parametric representation and encoder-decoder architecture achieves state-of-the-art performance with strong generalization across diverse conditions.

We introduce SAM 3D Body (3DB), a promptable model for single-image full-body 3D human mesh recovery (HMR) that demonstrates state-of-the-art performance, with strong generalization and consistent accuracy in diverse in-the-wild conditions. 3DB estimates the human pose of the body, feet, and hands. It is the first model to use a new parametric mesh representation, Momentum Human Rig (MHR), which decouples skeletal structure and surface shape. 3DB employs an encoder-decoder architecture and supports auxiliary prompts, including 2D keypoints and masks, enabling user-guided inference similar to the SAM family of models. We derive high-quality annotations from a multi-stage annotation pipeline that uses various combinations of manual keypoint annotation, differentiable optimization, multi-view geometry, and dense keypoint detection. Our data engine efficiently selects and processes data to ensure data diversity, collecting unusual poses and rare imaging conditions. We present a new evaluation dataset organized by pose and appearance categories, enabling nuanced analysis of model behavior. Our experiments demonstrate superior generalization and substantial improvements over prior methods in both qualitative user preference studies and traditional quantitative analysis. Both 3DB and MHR are open-source.
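As an illustration of what a category-organized evaluation could look like, the sketch below computes mean per-joint position error (MPJPE), a metric commonly used in human mesh recovery, and averages it per category. The abstract does not say which metrics or category names the authors use, so the choice of MPJPE, the function names, and labels such as "sitting" are assumptions made for this example only.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

Joint3D = Tuple[float, float, float]
Sample = Tuple[str, Sequence[Joint3D], Sequence[Joint3D]]  # (category, pred, gt)


def mpjpe(pred: Sequence[Joint3D], gt: Sequence[Joint3D]) -> float:
    """Mean Euclidean distance between corresponding predicted and
    ground-truth joints, in the same units as the inputs."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def mpjpe_by_category(samples: Iterable[Sample]) -> Dict[str, float]:
    """Average the per-sample MPJPE within each (hypothetical) category."""
    errors: Dict[str, List[float]] = defaultdict(list)
    for category, pred, gt in samples:
        errors[category].append(mpjpe(pred, gt))
    return {c: sum(v) / len(v) for c, v in errors.items()}


# Made-up joints, purely to exercise the functions.
demo = [
    ("sitting", [(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 3)]),
    ("sitting", [(0, 0, 0)], [(0, 4, 3)]),
    ("occluded", [(1, 1, 1)], [(1, 1, 1)]),
]
report = mpjpe_by_category(demo)
```

Reporting one number per category, rather than a single dataset-wide average, shows where a model fails (for example, on a particular pose type) even when its overall error looks low.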


Get this paper in your agent:

hf papers read 2602.15989

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


Similar Articles

SAM 3: Segment Anything with Concepts

Papers with Code Trending

SAM 3 introduces a unified model for promptable concept segmentation and tracking, achieving state-of-the-art performance with a decoupled recognition and localization architecture and a scalable data engine.

Geometry Matters: 3D Foundation Priors for Learning Semantic Correspondence

Hugging Face Daily Papers

This paper introduces a post-training framework that leverages 3D priors from SAM3D to improve semantic correspondence in 2D foundation features, addressing issues like left-right confusion and repeated parts. The method uses instance-specific 3D reconstruction without pose annotations or spherical geometry shortcuts.