MolmoAct 2
Summary
MolmoAct 2 is an open robotics model that reasons in 3D space before taking actions, developed by the Allen Institute for Artificial Intelligence.
Similar Articles
MolmoAct2: Action Reasoning Models for Real-world Deployment
Allen AI releases MolmoAct2, an open-weight Vision-Language-Action model designed for real-world robotic deployment, featuring new datasets, an open action tokenizer, and adaptive reasoning to reduce latency.
AllenAI has been iterating on their MolmoAct2 models for robotics
AllenAI has released open-source MolmoAct2 models for robot control, with multiple fine-tuned versions for different tasks, including full datasets and training code.
MolmoMotion: Language-guided 3D motion forecasting
MolmoMotion is a new language-guided 3D motion forecasting model that predicts future 3D point trajectories from video frames and action descriptions, achieving stronger performance than existing methods. Alongside the model, a large dataset (MolmoMotion-1M) and a benchmark (PointMotionBench) are released.
AllenAI releases MolmoMotion vision models for predicting future motion based on short frame history
AllenAI releases MolmoMotion, a vision model designed to predict future motion based on a short history of frames.
SpatialAct: Probing Spatial Reasoning-to-Action Capabilities of VLM Agents in 3D Scenes
SpatialAct is a new simulator-grounded benchmark that probes whether VLM agents can perform coherent spatial reasoning and translate it into actions in 3D environments across multi-turn feedback settings. Experiments reveal a significant reasoning-to-action gap, with current VLMs struggling to maintain spatial beliefs and produce reliable actions despite performing well on isolated reasoning tasks.