Learning complex goals with iterated amplification

OpenAI Blog 10/22/18, 07:00 AM Papers

Summary

OpenAI presents iterated amplification, a method for training AI systems on complex tasks by recursively decomposing them into smaller subtasks that humans can judge and solve, building up training signals from scratch through iterative composition.

We’re proposing an AI safety technique called iterated amplification that lets us specify complicated behaviors and goals that are beyond human scale, by demonstrating how to decompose a task into simpler sub-tasks, rather than by providing labeled data or a reward function. Although this idea is in its very early stages and we have only completed experiments on simple toy algorithmic domains, we’ve decided to present it in its preliminary state because we think it could prove to be a scalable approach to AI safety.


Cached at: 04/20/26, 02:46 PM

# Learning complex goals with iterated amplification

Source: [https://openai.com/index/learning-complex-goals-with-iterated-amplification/](https://openai.com/index/learning-complex-goals-with-iterated-amplification/)

Iterated amplification is a method for generating a training signal for the latter types of tasks, under certain assumptions. Namely, although a human can’t perform or judge the whole task directly, we assume that a human can, given a piece of the task, identify clear smaller components of which it’s made up. For example, in the networked computer example, a human could break down “defend a collection of servers and routers” into “consider attacks on the servers”, “consider attacks on the routers”, and “consider how the previous two attacks might interact”. Additionally, we assume a human can do very small instances of the task, for example “identify if a specific line in a log file is suspicious”. If these two things hold true, then we can build up a training signal for big tasks from human training signals for small tasks, using the human to coordinate their assembly.

In our implementation of amplification, we start by sampling small subtasks and training the AI system to do them by soliciting demonstrations from humans (who can do these small tasks). We then begin sampling slightly larger tasks, solving them by asking humans to break them up into small pieces, which AI systems trained from the previous step can now solve. We use the solutions to these slightly harder tasks, which were obtained with human help, as a training signal to train AI systems to solve these second-level tasks directly (without human help). We then continue to further composite tasks, iteratively building up a training signal as we go. If the process works, the end result is a totally automated system that can solve highly composite tasks despite starting with no direct training signal for those tasks.

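
The loop described above can be illustrated with a minimal sketch. This is not the implementation from the post: it assumes a hypothetical toy domain (counting suspicious lines in a tiny log), simulates the human with three plain functions (`human_solve_small`, `human_decompose`, `human_combine`), and uses a lookup table (`TableModel`) as a stand-in for a trained AI system. Because a lookup table cannot generalize, the sketch enumerates every task at each level instead of sampling tasks. All names and data here are assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy domain: a task is a tuple of log lines, and the answer is
# the number of suspicious lines it contains.
LINES = ("ok", "login failed", "rm -rf /")
SUSPICIOUS = {"login failed", "rm -rf /"}


def human_solve_small(task):
    """Simulated human demonstration for a one-line task."""
    assert len(task) == 1
    return 1 if task[0] in SUSPICIOUS else 0


def human_decompose(task):
    """Simulated human splits a task into two smaller subtasks."""
    mid = len(task) // 2
    return [task[:mid], task[mid:]]


def human_combine(sub_answers):
    """Simulated human assembles sub-answers into the parent answer."""
    return sum(sub_answers)


class TableModel:
    """Stand-in for the AI system: memorizes (task, answer) pairs."""

    def __init__(self):
        self.table = {}

    def train(self, examples):
        self.table.update(examples)

    def solve(self, task):
        # Raises KeyError for any task the model was never trained on.
        return self.table[task]


def amplify(model, task):
    """Human plus model: the human decomposes, the model solves the pieces."""
    subtasks = human_decompose(task)
    return human_combine([model.solve(s) for s in subtasks])


def iterated_amplification(levels):
    model = TableModel()
    # Level 0: train on human demonstrations of the smallest tasks.
    model.train({(line,): human_solve_small((line,)) for line in LINES})
    size = 1
    for _ in range(levels):
        size *= 2  # each level doubles the task size
        tasks = product(LINES, repeat=size)
        # Answers obtained with human help become the training signal
        # for solving these larger tasks directly.
        targets = {t: amplify(model, t) for t in tasks}
        model.train(targets)
    return model


model = iterated_amplification(levels=2)
log = ("ok", "rm -rf /", "ok", "login failed")
print(model.solve(log))  # prints 2
```

After two levels the model answers four-line tasks directly, even though the simulated human only ever judged single lines and combined pairs of sub-answers. A real system would replace `TableModel` with a learned model that generalizes, which is what makes sampling (rather than enumeration) workable.
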
This process is somewhat similar to [expert iteration](https://arxiv.org/pdf/1705.08439.pdf) (the method used in [AlphaGo Zero](https://www.nature.com/articles/nature24270)), except that expert iteration reinforces an existing training signal, while iterated amplification builds up a training signal from scratch. It also has features in common with [several](https://arxiv.org/pdf/1807.04640.pdf) [recent](https://people.eecs.berkeley.edu/~dawnsong/papers/iclr_2017_recursion.pdf) [learning algorithms](https://arxiv.org/abs/1611.02401) that use problem decomposition on-the-fly to solve a problem at test time, but differs in that it operates in settings where there is no prior training signal.
