@HuggingPapers: Geometric Action Model for Robot Policy Learning Repurposes a geometric foundation model as one backbone for perception…

X AI KOLs Following 06/16/26, 01:49 PM Papers

robot-learning policy-learning geometric-model foundation-model inference-speed libero

Summary

The Geometric Action Model (GAM) repurposes a geometric foundation model as a single backbone for robot policy learning. The 1.4B-parameter model reaches 85.5% on LIBERO-Plus with 6.9 ms inference, reported as 55× faster than baselines.

Geometric Action Model for Robot Policy Learning

Repurposes a geometric foundation model as one backbone for perception, prediction, and action.

1.4B parameters. 6.9 ms inference. 85.5% on LIBERO-Plus. 55× faster than baselines. https://t.co/wNYlFaghX0


Similar Articles

Geometric Action Model for Robot Policy Learning

Hugging Face Daily Papers

The Geometric Action Model (GAM) repurposes a pretrained geometric foundation model (GFM) as a unified backbone for language-conditioned robot manipulation, achieving higher accuracy, robustness, and efficiency than existing foundation-model-scale baselines across simulation and real-world benchmarks.

PoLAR: Factorizing Extent and Mode in Latent Actions for Robot Policy Learning

Hugging Face Daily Papers

PoLAR introduces a geometrically structured latent action representation in hyperbolic space that separates transition extent from mode, improving robotic policy learning performance.

LaWAM: Latent World Action Models for Efficient Dynamics-Aware Robot Policies

Hugging Face Daily Papers

LaWAM enables efficient robot control by predicting compact latent visual subgoals instead of expensive video generation, achieving state-of-the-art success rates with up to 24x lower latency than pixel-space world action models.

@artemZholus: thanks! in the second paper (https://arxiv.org/abs/2605.06388) we used your (and RAE's) recipe and it worked.

X AI KOLs Following

This paper systematically compares reconstruction-based and semantic latent spaces for action-conditioned latent diffusion world models in robotics. It finds that semantic encoders like V-JEPA 2.1 generally outperform reconstruction encoders on policy-relevant metrics, advocating for semantic latent spaces as a stronger foundation for robotics world models.

Revisiting Articulated Parts Perception in Robot Manipulation

Hugging Face Daily Papers

This paper introduces Geometric Primary Structure (GPS), a new representation for articulated parts perception in robot manipulation, enabling efficient VR-based annotation and achieving a 73% success rate without fine-tuning.
