Robots that learn

OpenAI Blog 05/16/17, 07:00 AM Papers

robotics imitation-learning neural-networks computer-vision simulation openai

Summary

OpenAI describes a robot learning system powered by two neural networks: a vision network trained on simulated images and an imitation network that generalizes task demonstrations to new configurations. The system is applied to block-stacking tasks, learning to infer and replicate task intent from paired demonstration examples.

We’ve created a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.

Original Article


Cached at: 04/20/26, 02:56 PM

# Robots that learn

Source: [https://openai.com/index/robots-that-learn/](https://openai.com/index/robots-that-learn/)

The system is powered by two neural networks: a vision network and an imitation network. The vision network ingests an image from the robot’s camera and outputs state representing the positions of the objects. As [before](https://blog.openai.com/spam-detection-in-the-physical-world/), the vision network is trained with hundreds of thousands of simulated images with different perturbations of lighting, textures, and objects. (The vision system is never trained on a real image.)

The imitation network observes a demonstration, processes it to infer the intent of the task, and then accomplishes the intent starting from another starting configuration. Thus, the imitation network must generalize the demonstration to a new setting.

But how does the imitation network know how to generalize? The network learns this from the distribution of training examples. It is trained on dozens of different tasks with thousands of demonstrations for each task. Each training example is a pair of demonstrations that perform the same task. The network is given the entirety of the first demonstration and a single observation from the second demonstration. We then use supervised learning to predict what action the demonstrator took at that observation. In order to predict the action effectively, the robot must learn how to infer the relevant portion of the task from the first demonstration.

Applied to block stacking, the training data consists of pairs of trajectories that stack blocks into a matching set of towers in the same order, but start from different start states. In this way, the imitation network learns to match the demonstrator’s ordering of blocks and size of towers without worrying about the relative location of the towers.
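The pairing scheme described above can be made concrete with a small sketch. This is not OpenAI's code: the `Demonstration` and `TrainingExample` types, the tuple-based observation and action formats, and the `sample_training_example` helper are all hypothetical names chosen for illustration. It shows only how one supervised example might be assembled from two demonstrations of the same task, with the whole first demonstration as context and a single observation from the second as the query.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical formats: the article does not specify how states or actions are encoded.
Observation = Tuple[float, ...]  # e.g. flattened block positions from the vision network
Action = Tuple[float, ...]       # e.g. a gripper command


@dataclass
class Demonstration:
    task_id: str
    observations: List[Observation]
    actions: List[Action]  # actions[t] is what the demonstrator did at observations[t]


@dataclass
class TrainingExample:
    context: Demonstration    # the entire first demonstration
    observation: Observation  # one observation from the second demonstration
    target_action: Action     # supervised label: the demonstrator's action there


def sample_training_example(
    demos_by_task: Dict[str, List[Demonstration]],
    rng: random.Random,
) -> TrainingExample:
    """Pick a task, then two distinct demonstrations of it, then one time step."""
    task_id = rng.choice(sorted(demos_by_task))
    first, second = rng.sample(demos_by_task[task_id], 2)
    t = rng.randrange(len(second.observations))
    return TrainingExample(first, second.observations[t], second.actions[t])


if __name__ == "__main__":
    # Toy data: two demonstrations of one made-up task, starting from different states.
    toy = {
        "stack_a_on_b": [
            Demonstration("stack_a_on_b", [(0.0, 0.1), (0.2, 0.1)], [(1.0,), (0.0,)]),
            Demonstration("stack_a_on_b", [(0.9, 0.4), (0.7, 0.4)], [(1.0,), (0.0,)]),
        ]
    }
    example = sample_training_example(toy, random.Random(0))
    print(example.observation, example.target_action)
```

A model trained on such examples would receive `context` and `observation` as input and be penalized for mispredicting `target_action`; the choice of network architecture and loss function is left open here because the text above does not state it.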


Similar Articles

One-shot imitation learning

OpenAI Robotics Symposium 2019

Roboschool

AI coding agents can autonomously direct robot training

Ingredients for robotics research
