robot-manipulation

AEGIS: A Backup Reflex for Physical AI

arXiv cs.AI · yesterday

AEGIS uses activation-probe early warning to switch to a stronger policy before failures compound in long-horizon robot manipulation, recovering twice as many failures as budget-matched escalation.

AHA-WAM: Asynchronous Horizon-Adaptive World-Action Modeling with Observation-Guided Context Routing

Hugging Face Daily Papers · 2d ago

AHA-WAM is an asynchronous world-action model that uses dual Diffusion Transformers to decouple world prediction from action execution, enabling efficient long-horizon planning and real-time control. It reports state-of-the-art performance on robotic manipulation tasks, with up to 92.8% success on RoboTwin and 78.3% on real-world tasks, while running closed-loop control at 24.17 Hz.

TBD-VLA: Temporal Block Diffusion Vision Language Action Model

Hugging Face Daily Papers · 5d ago

TBD-VLA introduces a discrete vision-language-action framework that combines block diffusion with autoregressive generation to achieve efficient temporal action modeling and faster inference, significantly outperforming prior VLA approaches in simulation and real-world manipulation tasks.

Dream.exe: Can Video Generation Models Dream Executable Robot Manipulation?

Hugging Face Daily Papers · 6d ago

Dream.exe proposes an evaluation framework that uses robotic manipulation tasks to assess video generation models' understanding of physical reality, finding that visual quality does not predict executable motion accuracy.

AFUN: Towards an Affordance Foundation Model for Functionality Understanding

Hugging Face Daily Papers · 2026-06-01

AFUN proposes an affordance foundation model that predicts functional masks and 3D motion curves from RGB-D observations and language descriptions, enabling generalizable robot manipulation across diverse environments. The model outperforms baselines on multiple benchmarks and can be deployed for real-world tasks without fine-tuning.

RoboSemanticBench: Diagnosing Semantic Grounding in Action Prediction for VLA Models

Hugging Face Daily Papers · 2026-06-01

RoboSemanticBench is a benchmark that diagnoses semantic grounding in action prediction for vision-language-action models, revealing that while robots can grasp objects, they fail to select the semantically correct target specified by the instruction.

IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation

Hugging Face Daily Papers · 2026-05-14

IntentVLA is a history-conditioned vision-language-action framework that improves robot imitation learning stability by encoding short-horizon intents from visual observations, addressing challenges from partial observability and ambiguous observations. It also introduces AliasBench, an ambiguity-aware benchmark for evaluating such methods.

RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data

Hugging Face Daily Papers · 2026-05-13

RoboEvolve is a framework that co-evolves a VLM planner and VGM simulator for robotic manipulation, achieving data efficiency with only 500 unlabeled seed images and robust continual learning.

Gemini Robotics On-Device brings AI to local robotic devices

Google DeepMind Blog · 2025-06-24

Google DeepMind introduces Gemini Robotics On-Device, an efficient VLA model optimized to run locally on robotic devices, enabling low-latency operation and offline capability while maintaining strong dexterous manipulation and task generalization. The model can be fine-tuned with as few as 50-100 demonstrations and comes with an SDK for developers.

Hindsight Experience Replay

OpenAI Blog · 2017-07-05

OpenAI presents Hindsight Experience Replay (HER), a technique enabling sample-efficient reinforcement learning from sparse binary rewards without complex reward engineering. It is demonstrated on robotic arm manipulation tasks including pushing, sliding, and pick-and-place, and validated on physical robots.
