AllenAI releases MolmoMotion vision models for predicting future motion based on short frame history

Reddit r/LocalLLaMA | 06/21/26, 04:26 AM | Models

Summary

AllenAI releases MolmoMotion, a vision model designed to predict future motion based on a short history of frames.

Similar Articles

MolmoMotion: Language-guided 3D motion forecasting

Hugging Face Blog

MolmoMotion is a new language-guided 3D motion forecasting model that predicts future 3D point trajectories from video frames and action descriptions, achieving stronger performance than existing methods. Alongside the model, a large dataset (MolmoMotion-1M) and a benchmark (PointMotionBench) are released.

AllenAI has been iterating on their MolmoAct2 models for robotics

Reddit r/LocalLLaMA

AllenAI has released open-source MolmoAct2 models for robot control, with multiple versions fine-tuned for different tasks, along with full datasets and training code.

MolmoAct2: Action Reasoning Models for Real-world Deployment

Papers with Code Trending

Allen AI releases MolmoAct2, an open-weight Vision-Language-Action model designed for real-world robotic deployment, featuring new datasets, an open action tokenizer, and adaptive reasoning to reduce latency.

MolmoAct 2

Product Hunt

MolmoAct 2 is an open robotics model that reasons in 3D space before taking actions, developed by the Allen Institute for Artificial Intelligence.

AnyMo: Scaling Any-Modality Conditional Motion Generation with Masked Modeling

Hugging Face Daily Papers

This paper introduces AnyMo, a unified multimodal framework for human motion generation that combines a Residual FSQ-based motion tokenizer with a scalable masked modeling transformer, along with the OmniHuMo dataset of over 5,000 hours of motion data to enable high-quality synthesis under arbitrary modality combinations.