Solving Rubik’s Cube with a robot hand

OpenAI Blog News

Summary

OpenAI developed a robot hand capable of solving a Rubik's Cube using a novel technique called Automatic Domain Randomization (ADR), which progressively increases simulation difficulty to enable effective transfer of learned behaviors from simulation to the real world.

We’ve trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand. The neural networks are trained entirely in simulation, using the same reinforcement learning code as OpenAI Five paired with a new technique called Automatic Domain Randomization (ADR). The system can handle situations it never saw during training, such as being prodded by a stuffed giraffe. This shows that reinforcement learning isn’t just a tool for virtual tasks, but can solve physical-world problems requiring unprecedented dexterity.

Cached at: 04/20/26, 02:55 PM

# Solving Rubik’s Cube with a robot hand

Source: [https://openai.com/index/solving-rubiks-cube/](https://openai.com/index/solving-rubiks-cube/)

The biggest challenge we faced was to create environments in simulation diverse enough to capture the physics of the real world. Factors like friction, elasticity, and dynamics are incredibly difficult to measure and model for objects as complex as Rubik’s Cubes or robotic hands, and we found that domain randomization alone is not enough.

To overcome this, we developed a new method called *Automatic Domain Randomization* (ADR), which endlessly generates progressively more difficult environments in simulation.[B](https://openai.com/index/solving-rubiks-cube/#citation-bottom-B) This frees us from needing an accurate model of the real world, and enables neural networks learned in simulation to transfer to the real world.

ADR starts with a single, nonrandomized environment, wherein a neural network learns to solve Rubik’s Cube. As the neural network gets better at the task and reaches a performance threshold, the amount of domain randomization is increased automatically. This makes the task harder, since the neural network must now learn to generalize to more randomized environments. The network keeps learning until it again exceeds the performance threshold, at which point more randomization kicks in, and the process repeats.
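The expand-on-success loop described above can be sketched in a few lines. This is a minimal toy illustration, not OpenAI's implementation: the function names (`adr_loop`, `sample_environment`), the single scalar expansion step, and the two example parameters are all assumptions made for clarity; the real ADR system adjusts per-parameter randomization bounds individually while training a policy with reinforcement learning.

```python
import random

PERFORMANCE_THRESHOLD = 0.8  # assumed value; the real threshold is a tuned hyperparameter
EXPAND_STEP = 0.1            # assumed fixed widening step per expansion

def sample_environment(ranges):
    """Draw one randomized environment from the current per-parameter ranges."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

def adr_loop(train_step, evaluate, ranges, iterations=100):
    """Train on sampled environments; each time performance exceeds the
    threshold, widen every randomization range, making the task harder."""
    for _ in range(iterations):
        env = sample_environment(ranges)
        train_step(env)
        if evaluate() >= PERFORMANCE_THRESHOLD:
            for name, (lo, hi) in ranges.items():
                ranges[name] = (lo - EXPAND_STEP, hi + EXPAND_STEP)
    return ranges

if __name__ == "__main__":
    # Toy stand-ins: "skill" grows with training steps instead of a real RL policy.
    skill = {"steps": 0}
    # Start from a single nonrandomized environment: zero-width ranges.
    ranges = {"friction": (0.5, 0.5), "cube_size": (1.0, 1.0)}

    def train_step(env):
        skill["steps"] += 1

    def evaluate():
        return min(1.0, skill["steps"] / 50)

    final = adr_loop(train_step, evaluate, ranges, iterations=100)
    print(final)  # ranges have widened each time performance crossed the threshold
```

The key design point the sketch preserves is that the ranges begin with zero width (a single fixed environment) and only widen after the policy demonstrates competence, so difficulty is always paced by current performance.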

Similar Articles

Learning dexterity

OpenAI Blog

OpenAI announces Dactyl, a system that learns robotic hand dexterity through simulation and reinforcement learning, using LSTMs to generalize across different physical environments and the Rapid PPO implementation to train policies that transfer to real-world manipulation tasks.

Domain randomization and generative models for robotic grasping

OpenAI Blog

Researchers explore a data generation pipeline using domain randomization and procedurally generated objects to train a deep neural network for robotic grasp planning. The proposed autoregressive model achieves >90% success on unseen objects in simulation and 80% in the real world, despite being trained only on random simulated objects.

Sim-to-real transfer of robotic control with dynamics randomization

OpenAI Blog

OpenAI researchers demonstrate a method to bridge the reality gap in robotic control by training policies with randomized simulator dynamics, enabling robots trained purely in simulation to successfully transfer to real-world tasks like object manipulation without physical training.

OpenAI Robotics Symposium 2019

OpenAI Blog

OpenAI hosted its first Robotics Symposium on April 27, 2019, bringing together robotics and machine learning experts to discuss learning robots and demonstrate their humanoid robot hand solving manipulation tasks using vision and reinforcement learning.

Generalizing from simulation

OpenAI Blog

OpenAI describes challenges with conventional RL on robotics tasks and introduces Hindsight Experience Replay (HER), a new RL algorithm that enables agents to learn from binary rewards by reframing failures as intended outcomes, combined with domain randomization for sim-to-real transfer.