AllenAI has been iterating on their MolmoAct2 models for robotics

Reddit r/LocalLLaMA 05/15/26, 09:30 PM Models

Summary

AllenAI has released open-source MolmoAct2 models for robot control, with multiple fine-tuned versions for different tasks, along with the full datasets and training code.

r/AllenAI is cooking with MolmoAct2, a 5B vision-language-action model for robot control. They keep releasing new fine-tunes on different kinds of robotics datasets, including (but not limited to):

* https://huggingface.co/allenai/MolmoAct2-LIBERO - general robotics tasks
* https://huggingface.co/allenai/MolmoAct2-DROID - interactive robotics tasks
* https://huggingface.co/allenai/MolmoAct2-BimanualYAM - absolute joint-pose control
* https://huggingface.co/allenai/MolmoAct2-SO100_101 - also absolute joint-pose control

AllenAI has released these as ***fully open source*** models, publishing not only their weights but also their **complete** training datasets (including pretraining), their training software source code, and technical papers describing the theory, training, and assessments of these models.

If you are fiddling with robots controlled via LLM inference, give the MolmoAct2 models a look.
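For anyone who wants to pull one of the checkpoints linked above, here is a minimal sketch. The repo IDs are copied from the links in the post. The `hub_url` and `fetch` helpers are hypothetical names made up for illustration, and the sketch assumes the `huggingface_hub` package is installed for the download step. It only fetches files; how to actually run inference is not covered in the post, so check each model card for that.

```python
# Repo IDs copied from the links in the post; the notes are the post's own descriptions.
MOLMOACT2_CHECKPOINTS = {
    "allenai/MolmoAct2-LIBERO": "general robotics tasks",
    "allenai/MolmoAct2-DROID": "interactive robotics tasks",
    "allenai/MolmoAct2-BimanualYAM": "absolute joint-pose control",
    "allenai/MolmoAct2-SO100_101": "absolute joint-pose control",
}


def hub_url(repo_id: str) -> str:
    """Build the Hugging Face model page URL for a repo ID (hypothetical helper)."""
    return f"https://huggingface.co/{repo_id}"


def fetch(repo_id: str) -> str:
    """Download a snapshot of one listed checkpoint and return the local path.

    Hypothetical helper. Assumes `huggingface_hub` is installed and that you
    have enough disk space; the download can be large.
    """
    if repo_id not in MOLMOACT2_CHECKPOINTS:
        raise KeyError(f"{repo_id} is not one of the checkpoints listed in the post")
    # Imported lazily so the catalog above is usable without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


if __name__ == "__main__":
    # Print the catalog without downloading anything.
    for repo_id, note in MOLMOACT2_CHECKPOINTS.items():
        print(f"{hub_url(repo_id)}  ({note})")
```

Calling `fetch("allenai/MolmoAct2-LIBERO")` would download that checkpoint into the local Hugging Face cache. Since new fine-tunes keep appearing, treat the dictionary as a snapshot of the post rather than a complete list.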
