AllenAI has been iterating on their MolmoAct2 models for robotics

Reddit r/LocalLLaMA Models

Summary

AllenAI has released open-source MolmoAct2 models for robot control, with multiple fine-tuned versions for different tasks, including full datasets and training code.

r/AllenAI is cooking with MolmoAct2, a 5B vision-language-action model for robot control. They keep releasing new fine-tunes on different kinds of robotics datasets. The list so far (more keep arriving):

* https://huggingface.co/allenai/MolmoAct2-LIBERO - general robotics tasks
* https://huggingface.co/allenai/MolmoAct2-DROID - interactive robotics tasks
* https://huggingface.co/allenai/MolmoAct2-BimanualYAM - absolute joint-pose control
* https://huggingface.co/allenai/MolmoAct2-SO100_101 - also absolute joint-pose control

AllenAI has released these as ***fully open source*** models, publishing not only the weights but also the **complete** training datasets (including pretraining), the training source code, and technical papers describing the theory, training, and evaluation of these models.

If you are fiddling with robots controlled via LLM inference, give the MolmoAct2 models a look.
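For anyone who wants to pull the checkpoints locally, here is a minimal sketch. It assumes the `huggingface_hub` package is installed and that the repos listed in the post are public and ungated; the helper names `repo_id_from_url` and `download` are made up for illustration. It only fetches files and does not cover running inference, which depends on whatever each model card specifies.

```python
from urllib.parse import urlparse

# Model pages taken from the post above.
MODEL_URLS = [
    "https://huggingface.co/allenai/MolmoAct2-LIBERO",
    "https://huggingface.co/allenai/MolmoAct2-DROID",
    "https://huggingface.co/allenai/MolmoAct2-BimanualYAM",
    "https://huggingface.co/allenai/MolmoAct2-SO100_101",
]


def repo_id_from_url(url: str) -> str:
    """Turn a Hugging Face model URL into an 'org/name' repo id."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model URL: {url}")
    return "/".join(parts[:2])


def download(url: str) -> str:
    """Download a full snapshot of the repo and return the local path.

    Hypothetical helper: needs network access and disk space for the weights.
    """
    # Imported lazily so repo_id_from_url works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_from_url(url))


if __name__ == "__main__":
    # Print the repo ids only; call download(url) yourself when ready.
    for url in MODEL_URLS:
        print(repo_id_from_url(url))
```

Calling `download(MODEL_URLS[0])` would fetch the LIBERO fine-tune into the local Hugging Face cache, assuming the repo is reachable from your machine.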
