robot-control

GigaWorld-Policy-0.5: A Faster and Stronger WAM Empowered by AutoResearch

Hugging Face Daily Papers · 2026-07-15

GigaWorld-Policy-0.5 is an enhanced World Action Model for robot control that improves training and inference efficiency through a Mixed Action-Conditioned World Modeling strategy and a Mixture-of-Transformers architecture, achieving 85ms latency on a local RTX 4090.

@rohanpaul_ai: Most video-action robot models are a content-creation video generator with an action module attached. LingBot-VA 2.0 fr…

X AI KOLs Timeline · 2026-07-13

LingBot-VA 2.0 is a video-action foundation model trained from scratch for robot control, achieving 225 Hz closed-loop execution with 13B parameters (1.9B active per token) and outperforming prior models on RoboTwin 2.0.

Any thoughts on this robot picking objects off a moving conveyor belt at 1x?

Reddit r/artificial · 2026-07-10

A robot using the LingBot-VA 2.0 video-action model picks objects off a moving conveyor belt in real time at 1x speed, predicting future movements rather than reacting only to the current frame.

I watched a robot keep up with a live air-hockey puck at real speed, and it predicts the play instead of just reacting

Reddit r/ArtificialInteligence · 2026-07-10

The post discusses the shift from reactive to prediction-based robot control, illustrated by the LingBot-VA 2.0 model, which can keep up with fast-moving objects such as an air-hockey puck and learn from few demonstrations.

Scaling Mixture-of-Experts Video Pretraining for Embodied Intelligence

Hugging Face Daily Papers · 2026-07-08

LingBot-Video presents a DiT-based video pretraining framework for embodied intelligence applications, combining a Mixture-of-Experts architecture, specialized data augmentation, and a multi-dimensional reward system.

PhysDrift: Bridging the Embodiment Gap in Humanoid Co-Speech Motion Generation

arXiv cs.AI · 2026-06-20

This paper identifies an embodiment gap in humanoid co-speech motion generation caused by human-centric pipelines, and proposes PhysDrift, an embodiment-aware framework that directly predicts executable humanoid joint trajectories from speech, improving speech-motion alignment and physical plausibility.

ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?

Hugging Face Daily Papers · 2026-06-17

ImageWAM proposes replacing video generation with pretrained image editing models in world action models for robot control, achieving superior performance while reducing FLOPs to 1/6 and latency to 1/4 of video-based approaches.

LaWAM: Latent World Action Models for Efficient Dynamics-Aware Robot Policies

Hugging Face Daily Papers · 2026-06-14

LaWAM enables efficient robot control by predicting compact latent visual subgoals instead of expensive video generation, achieving state-of-the-art success rates with up to 24x lower latency than pixel-space world action models.

World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis

Hugging Face Daily Papers · 2026-06-04

This paper introduces World-Language-Action (WLA) models, embodied foundation models that jointly predict textual subtasks, subgoal images, and robot actions from text, images, and robot states, achieving state-of-the-art multi-task and long-horizon learning in simulated and real-world environments.

AllenAI has been iterating on their MolmoAct2 models for robotics

Reddit r/LocalLLaMA · 2026-05-15

AllenAI has released open-source MolmoAct2 models for robot control, with multiple fine-tuned versions for different tasks, including full datasets and training code.

@GoogleDeepMind: We teamed up with @BostonDynamics to power their robot Spot with Gemini Robotics embodied reasoning models. This means …

X AI KOLs · 2026-04-16

Google DeepMind partnered with Boston Dynamics to integrate Gemini Robotics embodied reasoning models into their Spot robot, enabling improved environmental understanding, object identification, and command following for tasks like tidying rooms.
