Asymmetric actor critic for image-based robot learning

OpenAI Blog Papers

Summary

OpenAI proposes an asymmetric actor-critic method for robot learning that leverages full state observability in simulators to train policies that operate on partial observations (RGBD images), enabling effective sim-to-real transfer without real-world training data.


# Asymmetric actor critic for image-based robot learning

Source: [https://openai.com/index/asymmetric-actor-critic-for-image-based-robot-learning/](https://openai.com/index/asymmetric-actor-critic-for-image-based-robot-learning/)

## Abstract

Deep reinforcement learning (RL) has proven a powerful technique in many sequential decision-making domains. However, robotics poses many challenges for RL, most notably that training on a physical system can be expensive and dangerous, which has sparked significant interest in learning control policies using a physics simulator. While several recent works have shown promising results in transferring policies trained in simulation to the real world, they often do not fully exploit the advantage of working with a simulator. In this work, we use the full state observability available in the simulator to train better policies that take only partial observations (RGBD images) as input. We do this by employing an actor-critic training algorithm in which the critic is trained on full states while the actor (or policy) receives rendered images as input. We show experimentally on a range of simulated tasks that using these asymmetric inputs significantly improves performance. Finally, we combine this method with domain randomization and show real-robot experiments on several tasks such as picking, pushing, and moving a block. We achieve this simulation-to-real-world transfer without training on any real-world data.
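The asymmetric-input idea lends itself to a short sketch. Below is a minimal, illustrative PyTorch version: the critic consumes the full low-dimensional simulator state, while the actor only ever sees rendered RGBD images, so the deployed policy needs no access to ground-truth state. The DDPG-style update, network sizes, and replay-buffer layout are assumptions made for illustration, not the authors' exact implementation.

```python
# Sketch of asymmetric actor-critic training: critic on full state, actor on RGBD images.
# Architecture details and the DDPG-style update are illustrative assumptions.
import torch
import torch.nn as nn

class FullStateCritic(nn.Module):
    """Q(s, a): trained on the full low-dimensional simulator state."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.q = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.q(torch.cat([state, action], dim=-1))

class ImageActor(nn.Module):
    """pi(o): acts from rendered RGBD observations only (4 input channels)."""
    def __init__(self, action_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.LazyLinear(action_dim)  # infers the flattened feature size

    def forward(self, obs):
        return torch.tanh(self.head(self.encoder(obs)))

def update(actor, critic, target_actor, target_critic,
           actor_opt, critic_opt, batch, gamma=0.99):
    # The replay buffer is assumed to store both views of each transition;
    # reward and done are assumed to have shape [batch, 1].
    state, obs, action, reward, next_state, next_obs, done = batch

    # Critic regresses toward a bootstrapped target computed from the *full*
    # next state; the next action still comes from the image-based actor.
    with torch.no_grad():
        next_action = target_actor(next_obs)
        target_q = reward + gamma * (1 - done) * target_critic(next_state, next_action)
    critic_loss = nn.functional.mse_loss(critic(state, action), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor is improved through the full-state critic, but it only ever
    # consumes images, so no state estimator is needed at deployment.
    actor_loss = -critic(state, actor(obs)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```

At deployment only the image actor is used; the full-state critic exists purely to shape training in simulation, which is why the method can transfer to a real robot that never exposes ground-truth state.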

Similar Articles

Sim-to-real transfer of robotic control with dynamics randomization

OpenAI Blog

OpenAI researchers demonstrate a method to bridge the reality gap in robotic control by training policies with randomized simulator dynamics, enabling robots trained purely in simulation to successfully transfer to real-world tasks like object manipulation without physical training.

Robots that learn

OpenAI Blog

OpenAI describes a robot learning system powered by two neural networks — a vision network trained on simulated images and an imitation network that generalizes task demonstrations to new configurations. The system is applied to block-stacking tasks, learning to infer and replicate task intent from paired demonstration examples.

Generalizing from simulation

OpenAI Blog

OpenAI describes challenges with conventional RL on robotics tasks and introduces Hindsight Experience Replay (HER), a new RL algorithm that enables agents to learn from binary rewards by reframing failures as intended outcomes, and combines it with domain randomization for sim-to-real transfer.

Competitive self-play

OpenAI Blog

OpenAI demonstrates that competitive self-play in simulated 3D robot environments enables AI agents to discover complex physical behaviors like tackling, ducking, and faking without explicit instruction, suggesting self-play will be fundamental to future powerful AI systems.

Adversarial attacks on neural network policies

OpenAI Blog

OpenAI researchers demonstrate that adversarial attacks, previously studied in computer vision, are also effective against neural network policies in reinforcement learning, showing significant performance degradation from small, imperceptible perturbations in both white-box and black-box settings.