Asymmetric actor critic for image-based robot learning

OpenAI Blog Papers

Summary

OpenAI proposes an asymmetric actor-critic method for robot learning that leverages full state observability in simulators to train policies that operate on partial observations (RGBD images), enabling effective sim-to-real transfer without real-world training data.


# Asymmetric actor critic for image-based robot learning

Source: [https://openai.com/index/asymmetric-actor-critic-for-image-based-robot-learning/](https://openai.com/index/asymmetric-actor-critic-for-image-based-robot-learning/)

## Abstract

Deep reinforcement learning (RL) has proven a powerful technique in many sequential decision-making domains. However, robotics poses many challenges for RL, most notably that training on a physical system can be expensive and dangerous, which has sparked significant interest in learning control policies using a physics simulator. While several recent works have shown promising results in transferring policies trained in simulation to the real world, they often do not fully exploit the advantage of working with a simulator. In this work, we use the full state observability available in the simulator to train better policies that take only partial observations (RGBD images) as input. We do this by employing an actor-critic training algorithm in which the critic is trained on full states while the actor (or policy) receives rendered images as input. We show experimentally on a range of simulated tasks that using these asymmetric inputs significantly improves performance. Finally, we combine this method with domain randomization and show real-robot experiments on several tasks such as picking, pushing, and moving a block. We achieve this simulation-to-real-world transfer without training on any real-world data.
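The asymmetric-input idea lends itself to a short sketch. Below is a minimal, illustrative PyTorch version: the critic consumes the full low-dimensional simulator state, while the actor only ever sees rendered RGBD images, so the deployed policy needs no access to ground-truth state. The DDPG-style update, network sizes, and replay-buffer layout are assumptions made for illustration, not the authors' exact implementation.

```python
# Sketch of asymmetric actor-critic training: critic on full state, actor on RGBD images.
# Architecture details and the DDPG-style update are illustrative assumptions.
import torch
import torch.nn as nn

class FullStateCritic(nn.Module):
    """Q(s, a): trained on the full low-dimensional simulator state."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.q = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.q(torch.cat([state, action], dim=-1))

class ImageActor(nn.Module):
    """pi(o): acts from rendered RGBD observations only (4 input channels)."""
    def __init__(self, action_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.LazyLinear(action_dim)  # infers the flattened feature size

    def forward(self, obs):
        return torch.tanh(self.head(self.encoder(obs)))

def update(actor, critic, target_actor, target_critic,
           actor_opt, critic_opt, batch, gamma=0.99):
    # The replay buffer is assumed to store both views of each transition;
    # reward and done are assumed to have shape [batch, 1].
    state, obs, action, reward, next_state, next_obs, done = batch

    # Critic regresses toward a bootstrapped target computed from the *full*
    # next state; the next action still comes from the image-based actor.
    with torch.no_grad():
        next_action = target_actor(next_obs)
        target_q = reward + gamma * (1 - done) * target_critic(next_state, next_action)
    critic_loss = nn.functional.mse_loss(critic(state, action), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor is improved through the full-state critic, but it only ever
    # consumes images, so no state estimator is needed at deployment.
    actor_loss = -critic(state, actor(obs)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```

At deployment only the image actor is used; the full-state critic exists purely to shape training in simulation, which is why the method can transfer to a real robot that never exposes ground-truth state.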

Similar Articles

Sim-to-real transfer of robotic control with dynamics randomization

OpenAI Blog

OpenAI researchers demonstrate a method to bridge the reality gap in robotic control by training policies with randomized simulator dynamics, enabling robots trained purely in simulation to successfully transfer to real-world tasks like object manipulation without physical training.

Robots that learn

OpenAI Blog

OpenAI describes a robot learning system powered by two neural networks — a vision network trained on simulated images and an imitation network that generalizes task demonstrations to new configurations. The system is applied to block-stacking tasks, learning to infer and replicate task intent from paired demonstration examples.

Generalizing from simulation

OpenAI Blog

OpenAI describes challenges with conventional RL on robotics tasks and introduces Hindsight Experience Replay (HER), a new RL algorithm that enables agents to learn from binary rewards by reframing failures as intended outcomes, and combines it with domain randomization for sim-to-real transfer.

Competitive self-play

OpenAI Blog

OpenAI demonstrates that competitive self-play in simulated 3D robot environments enables AI agents to discover complex physical behaviors like tackling, ducking, and faking without explicit instruction, suggesting self-play will be fundamental to future powerful AI systems.

Adversarial attacks on neural network policies

OpenAI Blog

OpenAI researchers demonstrate that adversarial attacks, previously studied in computer vision, are also effective against neural network policies in reinforcement learning, showing significant performance degradation from small, imperceptible perturbations in both white-box and black-box settings.