@Khazix0918: https://x.com/Khazix0918/status/2062731170337763796
Summary
Anthropic publishes in-depth article 'When AI builds itself', showing AI systems accelerating their own development, including code generation, benchmark saturation, and internal data indicating an 8x increase in engineer productivity. The article explores the trend and potential impact of recursive self-improvement.
Cached at: 06/05/26, 01:14 PM
Anthropic’s 10,000-Word Essay: When AI Builds Itself, Where Does Humanity Go?
Today’s content might be a bit unusual. It’s a brand new article from Anthropic, published in the early hours.
The title is “When AI builds itself.”
I read it around 1 AM, and after finishing it, I immediately shared it in all my groups because I genuinely learned a lot from it.
It’s incredibly valuable.
Then I started writing my own post, wanting to share my reflections as well. But as I wrote, I felt I couldn’t do it justice. I couldn’t capture the vastness of the original text.
So, I stopped writing my own take.
Content like this deserves the original text.
Therefore, I have fully translated and polished that article to share with you. I hope you find it useful. Please read until the end; it’s worth 20 minutes of your time.
Below is the translation of When AI builds itself:
When AI Builds Itself
For most of AI’s history, every step of the development cycle has been driven by humans. But at Anthropic, we are handing more and more of the AI development work over to AI systems themselves – and this is accelerating our work.
If you push this trend to its extreme, with enough compute, the endpoint it points to is an AI system that can completely autonomously design and develop its own next generation. This is called recursive self-improvement. We are not there yet, and recursive self-improvement is not inevitable. But it could arrive much faster than most institutions expect or prepare for.
Using public benchmarks and previously undisclosed internal Anthropic data, Anthropic Research shows that AI is already accelerating the development of AI systems themselves. For example, an Anthropic engineer today delivers on average 8 times as much code per quarter as between 2021 and 2025.
The technical trends discussed in this article suggest that AI systems will become significantly more powerful in the coming years. These trends carry immense implications. AI that can build itself will be a major milestone in technological history, potentially bringing great benefits to the world in areas like science and medicine. But full recursive self-improvement could also increase the risk of humans losing control over AI systems. If a system has the ability to completely autonomously build its successor, then our safety measures, monitoring, and behavior shaping become even more critical.
Evidence from the Outside World
The pace of AI model improvement is accelerating. The duration of tasks a model can complete independently and reliably is roughly doubling every four months, a clear acceleration from the previous trend of doubling every seven months.
In March 2024, Claude Opus 3 could handle software tasks that would take a human roughly four minutes. A year later, Claude Sonnet 3.7 could handle tasks lasting about an hour and a half. Another year later, Claude Opus 4.6 could handle 12-hour tasks. If this trend continues, tasks requiring a skilled engineer several days could fall within AI’s capabilities this year. By 2027, AI systems might be able to handle tasks requiring weeks of human effort.
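The doubling times quoted above can be checked against the three data points in this paragraph. The following is a minimal sketch, not the methodology behind the published trend: it assumes each "a year later" is exactly 12 months and that growth between consecutive points is purely exponential. The function and variable names are illustrative.

```python
import math

def doubling_time_months(start_minutes: float, end_minutes: float, months_elapsed: float) -> float:
    """Months per doubling implied by exponential growth between two points."""
    doublings = math.log2(end_minutes / start_minutes)
    return months_elapsed / doublings

def project_minutes(current_minutes: float, months_ahead: float, doubling_months: float) -> float:
    """Task length after months_ahead if the doubling time stays constant."""
    return current_minutes * 2 ** (months_ahead / doubling_months)

# Task lengths from the text, in minutes (ASSUMPTION: points are exactly 12 months apart).
opus_3 = 4            # March 2024: about four minutes
sonnet_3_7 = 90       # about an hour and a half
opus_4_6 = 12 * 60    # 12 hours

first_year = doubling_time_months(opus_3, sonnet_3_7, 12)
second_year = doubling_time_months(sonnet_3_7, opus_4_6, 12)
print(f"Implied doubling time, first year:  {first_year:.1f} months")
print(f"Implied doubling time, second year: {second_year:.1f} months")

# Extrapolate one more year at the four-month doubling time quoted in the text.
one_year_out_hours = project_minutes(opus_4_6, 12, 4) / 60
print(f"Projected task length one year on: {one_year_out_hours:.0f} hours")
```

Under these assumptions, the jump from an hour and a half to 12 hours is exactly three doublings in a year, which matches the four-month figure. One more year at that rate gives 96 hours of human work, a little over two 40-hour working weeks.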
The same pattern appears in programming and research benchmarks. Benchmarks measure a model’s performance in a specific area. When a model’s score approaches 100%, we say the benchmark has been “saturated.”
SWE-bench is a standard real-world software engineering test: it gives a model a real open-source codebase and a real bug report, asking it to write a fix that passes the project’s own tests. Model scores went from low single digits to saturating the entire benchmark in just two years.
CORE-Bench tests whether a model can reproduce existing research, a prerequisite for original research. It gives an AI model the code and data from a published paper and asks it to re-run everything and confirm whether the results can be reproduced. AI system success rates rose from about 20% in 2024 to saturating the benchmark 15 months later.
METR, which runs long-duration task benchmarks, found that Claude Mythos Preview could work continuously for “at least” 16 hours and was “at the upper limit of what METR can measure without adding new tasks.”
Public benchmarks reveal a lot about the capabilities of these systems. But they cannot show the extent to which AI systems are accelerating AI development itself. To see that, we need direct evidence from inside an AI company like Anthropic.
Evidence from Inside Anthropic
Building a frontier model requires two broad categories of work.
One is engineering: writing code, building infrastructure, supervising model training. The other is research: deciding which experiments to run, interpreting experimental results, figuring out what to try next.
In both engineering and research, the picture we see is consistent. On the engineering front, Claude can take a vaguely described problem and find a solution on its own; humans provide the goal, but no longer the method. On the research front, for a clearly defined experiment, Claude can match or exceed the execution level of a skilled human. However, there remains a significant gap between Claude and humans when it comes to using judgment to choose goals. This is the chasm between today’s AI and a future system that can autonomously design its successor.
At Anthropic, as employees gain experience, they typically take on increasingly open-ended and important tasks. Early on, you execute tasks specified by others, like “the export button is broken, please fix it.” After gaining experience, you get a goal and design the path yourself, like “investigate why the network slows down under high load.” At the highest level, you decide which problems are worth solving: “What should the team do next quarter?” We can use internal Anthropic data to see how far Claude has come in handling these different levels of tasks.
Claude writes a significant proportion of Anthropic’s codebase.
As of May 2026, over 80% of the code merged into Anthropic’s codebase was written by Claude. Before Claude Code was released as a research preview in February 2025, this number was in the low single digits. This shift is also reflected in output per engineer. In Anthropic’s first four years (2021–2024), the number of lines of code merged per engineer per day was roughly flat. Then it started climbing in 2025 – as Claude went from just suggesting code to running code itself. The curve steepened again in 2026 as models began working autonomously over longer spans.
The chart below shows these two inflection points. In Q2 2026, a typical engineer merged 8 times as many lines of code per day as in 2024. The reason is that most code is written by Claude; the engineer’s role has shifted to directing and reviewing, rather than writing code themselves.
One caveat: lines of code is an imperfect metric because it measures quantity, not quality. So the 8x lines of code per engineer per day in Q2 2026 is almost certainly an overestimate of true productivity improvement. Nevertheless, it indicates an acceleration. At Anthropic, we don’t measure employee contribution by lines of code; team members produce more code purely because they are using AI systems to write more code.
The growth in lines of code aligns with a subjective feeling of enormous productivity increases. In March 2026, an internal survey of 130 Anthropic research team members found that the median respondent estimated that using Mythos Preview made them roughly 4 times more productive on the projects they would have done anyway, compared to using no AI model. We expect the actual improvement in March was slightly lower. But we believe the overall picture is credible and consistent with our other observations: a significant portion of Anthropic’s technical staff complete core work several times faster than without AI assistance.
We also see Anthropic employees using Claude for work that simply wouldn’t happen otherwise: building exploratory tools, tackling long-standing cleanup tasks. For example, in April 2026, Claude delivered over 800 fixes, reducing the incidence of a class of API errors by a factor of one thousand. The engineer supervising Claude estimated that doing this work by hand would have taken four years; fixing other people’s bugs is slow and painful, and it’s hard for a human to hold so much unfamiliar context in their head at once.
The code Claude writes is “good enough,” and getting better.
“Good code” has two meanings: it works, and it’s written in a way that another engineer can understand and build upon. On the first criterion, the evidence is very clear. Over the past year, the frequency with which Anthropic employees correct Claude, take over mid-task, or redirect Claude back on track has been steadily declining, even on the most complex and open-ended tasks. Open-ended tasks are those without a clear specification, where the engineer themselves isn’t sure what the answer looks like. The chart below shows Claude’s success rate on tasks of varying difficulty over time. The code Claude writes does work.
On the most open-ended tasks, Claude’s success rate reached 76% in May 2026, a 50 percentage point improvement in six months. An example of a task at this difficulty level: a routine upgrade caused tens of thousands of training jobs to crash. An engineer gave Claude some text content and cluster access, pointing it at the ongoing incident. Claude worked through running jobs one by one, testing environment configurations, eventually pinpointing an obscure debug flag causing the crash. It reliably reproduced the issue and confirmed the fix. Claude took about two hours, completing work that would normally take two to three days.
The second criterion is writing code that other engineers can understand and build upon. On this point, a gap between humans and AI still exists, but it’s closing quickly. Views within Anthropic are not unanimous, but many believe that Claude’s code quality in late 2025 was still below the level of Anthropic human engineers, while today it is roughly on par. We expect Claude’s code quality to surpass humans within the year.
This has already changed how Anthropic reviews its own code. Changes to our codebase now pass through an automated Claude reviewer before being merged, checking for bugs, security vulnerabilities, and other defects. We did a retrospective analysis and found that if this automated Claude review had been applied to every change, about a third of the bugs that ever caused an incident on claude.ai could have been caught before reaching production. And the engineers writing that code are among the best in the world at building these systems. Claude is now catching mistakes they miss.
“In late 2025, Claude wrote code slightly worse than Anthropic human engineers. Today it’s roughly on par. We expect it to clearly surpass humans within the year.”
Claude is good at running experiments when the goal is set by someone else.
Every time Anthropic releases a model, we run the same test: give Claude code to train a small AI model, ask it to make the code run as fast as possible while passing the same correctness checks. The goal and success metric are pre-defined; Claude’s task is to find speedups by rewriting code, running it, timing it, and iterating. This is a microcosm of the experimental research loop.
In May 2025, Claude Opus 4 improved code speed by an average of about 3x. By April 2026, Claude Mythos Preview achieved about 52x. For reference, a skilled human researcher takes four to eight hours to achieve a 4x improvement. In this part of the research workflow – optimizing within a clearly defined experimental framework – Claude went from “very helpful” to “superhuman” in less than a year.
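The loop described above (rewrite, run, time, and keep a change only if it still passes the correctness check) can be sketched in a few lines. This is an illustration under stated assumptions, not Anthropic's actual test: the "training code" is replaced by a toy sum-of-squares function, and every name here is hypothetical.

```python
import time
from typing import Callable, Optional, Sequence

Workload = Callable[[Sequence[float]], float]

def best_time(fn: Workload, data: Sequence[float], repeats: int = 5) -> float:
    """Best wall-clock time of fn(data) over several runs, in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        timings.append(time.perf_counter() - start)
    return min(timings)

def evaluate_candidate(baseline: Workload, candidate: Workload,
                       data: Sequence[float], tolerance: float = 1e-6) -> Optional[float]:
    """Speedup of candidate over baseline, or None if it fails the correctness check."""
    if abs(baseline(data) - candidate(data)) > tolerance:
        return None  # a faster but wrong rewrite does not count
    return best_time(baseline, data) / best_time(candidate, data)

# Toy stand-ins for the code being optimized (ASSUMPTION: purely illustrative).
def baseline_sum_squares(values: Sequence[float]) -> float:
    total = 0.0
    for v in values:
        total += v * v
    return total

def candidate_sum_squares(values: Sequence[float]) -> float:
    return sum(v * v for v in values)

def broken_candidate(values: Sequence[float]) -> float:
    return sum(values)  # fast, but computes the wrong thing

data = [float(i) for i in range(10_000)]
speedup = evaluate_candidate(baseline_sum_squares, candidate_sum_squares, data)
rejected = evaluate_candidate(baseline_sum_squares, broken_candidate, data)
print(f"candidate speedup: {speedup:.2f}x")
print(f"broken candidate accepted: {rejected is not None}")
```

The design point is that the correctness check runs before any timing, so a rewrite that is fast but wrong is rejected and never reports a speedup. The measured speedup itself varies from machine to machine.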
“The picture right now is roughly: humans have ideas, and models can implement, test, and validate those ideas an order of magnitude faster than before.”
Claude is also getting better at autonomously proposing experiments.
In April 2026, Anthropic published the first case study of Claude end-to-end completing an open-ended research project independently. An agent driven by Claude was given an open problem in AI safety, roughly “can a weaker model reliably supervise a stronger model,” and was let loose to solve it. This process involved proposing hypotheses, testing them, sharing findings with parallel agents, and iterating.
The task had a clear “lower bound” and “upper bound”: the lower bound was the performance of the weak supervisor working alone; the upper bound was the performance of the strong model trained on correct answers. Two human researchers, in about a week, closed roughly 23% of the gap. The agents, over an aggregate 800 hours of work and roughly $18,000 in compute costs, closed 97% of the gap.
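The "gap closed" figure is a normalized score: how far a method moves from the lower bound toward the upper bound. A minimal sketch follows. The accuracy values are hypothetical, chosen only to show the arithmetic; only the $18,000 and 800-hour figures come from the text.

```python
def gap_closed(score: float, lower_bound: float, upper_bound: float) -> float:
    """Fraction of the lower-to-upper gap recovered by a method scoring `score`."""
    if upper_bound <= lower_bound:
        raise ValueError("upper_bound must exceed lower_bound")
    return (score - lower_bound) / (upper_bound - lower_bound)

# HYPOTHETICAL accuracies, used only to illustrate the metric.
weak_alone = 0.60      # lower bound: weak supervisor working alone
strong_ceiling = 0.80  # upper bound: strong model trained on correct answers
method_score = 0.70    # a method that lands halfway between the two

print(f"gap closed: {gap_closed(method_score, weak_alone, strong_ceiling):.0%}")

# Cost figures from the text: about $18,000 of compute over 800 aggregate agent-hours.
cost_per_agent_hour = 18_000 / 800
print(f"compute cost: ${cost_per_agent_hour:.2f} per agent-hour")
```

A value of 0 means no better than the weak supervisor alone, and 1 means the method matches the strong model trained on correct answers. The reported 23% and 97% are points on this scale.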
This work had notable limitations: the results didn’t cleanly transfer to production-scale models, and humans still defined the problem and evaluation criteria. But within those boundaries, every experiment was designed by the agent itself. The only substantive human input was setting the initial research direction.
“Claude essentially did this work in a day or two, with almost no intervention from me. I would be pleasantly surprised if a junior colleague returned similar results in the same timeframe. The future is here.”
Claude is getting better at steering research conversations towards valuable discoveries.
We examined real Claude Code sessions from January to March 2026, where Anthropic researchers collaborated with Claude on an open-ended exploratory problem – like figuring out why a training run kept crashing, or why a model performed poorly on a benchmark.
In each case, we found a moment where the researcher “went down a rabbit hole”: they pursued a direction that led the session astray before returning to the correct path. We then showed only the work preceding the detour to different versions of the Claude model, asking it what it would do next. Another Claude instance, which could see the full eventual trajectory of the session, judged whether the AI or human gave the better next-step suggestion.
Because we deliberately selected moments (n=129) where human judgment had room for improvement, this is not a fair comparison between model and human judgment. Instead, these moments give us a set of real, challenging situations where the right next step is non-obvious, and the human choice serves as a useful baseline for tracking model progress over time.
By this metric, our best model in November 2025 (Opus 4.5) gave a better choice than the human 51% of the time; by April 2026 (Mythos Preview), this had grown to 64%. The day-to-day work of research is largely a chain of these “next-step decisions,” making this a relevant metric for whether models can eventually conduct investigations independently. We see this result as an early signal: AI systems are getting better at the kind of judgment that AI research relies on.
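With 129 decision points, each percentage carries some sampling noise. The sketch below gives a rough sense of scale using a normal-approximation interval for a proportion. It is an assumption-laden illustration, not an analysis from the article: it treats each moment as an independent trial and ignores any noise introduced by the Claude judge.

```python
import math

def win_rate_interval(win_rate: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% interval for a proportion (normal approximation)."""
    standard_error = math.sqrt(win_rate * (1 - win_rate) / n)
    return win_rate - z * standard_error, win_rate + z * standard_error

n_moments = 129  # number of selected decision points reported in the text

# Win rates reported in the text; the intervals are this sketch's own estimate.
for label, rate in [("Opus 4.5, November 2025", 0.51),
                    ("Mythos Preview, April 2026", 0.64)]:
    low, high = win_rate_interval(rate, n_moments)
    print(f"{label}: {rate:.0%} (roughly {low:.0%} to {high:.0%})")
```

At this sample size the interval around each figure is several percentage points wide in either direction.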
“For now, the human comparative advantage remains in seeing the bigger picture, thinking beyond the immediate task.”
What Might Future Work at Anthropic Look Like?
The evidence suggests that the human role at every step of the AI development process is narrowing. Once human and AI-written code reach the same quality, humans will stop writing code entirely, shifting solely to review. But if they can’t review code as fast as Claude generates it, human review becomes the new bottleneck in AI development. Similarly, when Claude can run experiments itself, the question becomes “which of these experiments are worth doing?”
Simply put: the execution layer – writing code, running experiments, producing results – is already approaching zero cost in human time, though not in compute cost.
The area where humans still hold a comparative advantage is research taste and judgment: choosing which problems are important, which results are credible, and when to abandon a dead-end line of inquiry.
“Work (and life) used to run on a gift economy of small favors between people. ‘Can you help me get this script running?’ … each time creating a little bit of social debt, a little bit of connection. Claude is faster, it incurs no social debt, but each such replacement is a lost opportunity for interpersonal collaboration.”
“On good days, I can’t help feeling like nothing I do matters anymore, everything is automated, and better and faster than me. But there are days when everything breaks down, and I don’t know why, and then I realize I’m not quite sure what I’ve been doing all along.”
What If We’re Wrong?
A natural rebuttal to the above evidence is that the part still in human hands – choosing which problems to solve – is the most critical. Without this judgment, Claude is a capable assistant, not a system that can independently drive AI progress.
It is not yet clear whether today’s training methods and architectures can unlock this capability. But AI progress rarely relies on “eureka moments.” There have been a few such moments in AI’s recent history, like the Transformer architecture and mixture of experts, but these paradigm-level breakthroughs occur years apart. In between breakthroughs, most progress is incremental: we make something bigger, see where it breaks, fix it, and try again. And this is precisely the workflow Claude excels at today. Edison said genius is 1% inspiration and 99% perspiration. But what we’re seeing is that the 99% perspiration is being increasingly automated.
It is becoming increasingly clear that a substantial fraction of the work that drives frontier progress is automatable. Large-scale research progress heavily depends on tools and resources, which determine how quickly you can run experiments, how many you can run simultaneously, and how fast you get results.
Even assuming Claude can never have good research taste, a conservative reading of our evidence still implies a compounding acceleration. If humans spend most of their time on the single-digit percentage of work involving setting direction, while Claude handles everything else, that means each engineer or researcher commands a vastly larger scale of work than before. The evidence we see suggests Anthropic employees are both moving faster and covering more ground. In practice, this means AI has already made Anthropic operate much faster than before effective AI tools existed.
A bolder reading is this: the early signals of improvement in Claude’s research judgment – limited today – are precisely evidence that this capability is itself improving. So-called “research taste” might just be another AI capability, one that AI systems will fail at for a while and then get better at. We’ve seen this pattern with other qualitative skills, like an AI system understanding why a joke is funny, exhibiting theory of mind, or solving language puzzles.
Possible Futures
What happens next depends on two things: whether trends continue, and if they do, how we choose to respond. We can envision at least three future scenarios:
Scenario One: Trends Stall, But Current AI Capabilities Have Already Spread Widely
This article features many exponential growth trajectories. But these trajectories could actually be S-curves. We may be approaching the bend – diminishing returns, the curve flattening before plateauing. The kind of judgment that distinguishes an adequate researcher from an excellent one might be a capability that cannot be obtained by simply scaling up training resources like compute and data. If so, breaking through this bottleneck would require new ideas, like a completely new architecture replacing the Transformer architecture used by all current frontier models.
Another possibility is that the constraint on AI progress is not the model itself, but the supply chain: the energy and compute needed to advance and deploy frontier technology may exceed current supply. Chip fabrication, grid expansion, or interconnect bandwidth might be the real bottleneck, not intelligence itself. We also cannot rule out an external shock that significantly drags down the AI ecosystem, like a sudden contraction in compute or electricity supply. Either of these would slow progress and make forward-looking investments by labs more expensive. Or there could be some other obstacle we haven’t foreseen.
Even if model capabilities were frozen at today’s level, we would expect the world to change significantly. Project Glasswing is an early signal: in its first few weeks of deployment, Mythos Preview found over ten thousand high-severity and critical software vulnerabilities in the world’s most important systems, to the point where the bottleneck in cybersecurity defense has shifted from finding vulnerabilities to patching them fast enough.
And the diffusion of current models into the broader economy is still in its early stages. In that world, a 100-person company is increasingly likely to punch above its weight, achieving the output of a 1000-person company, because each employee will sit at the apex of a pyramid of agents.
We include this scenario for completeness, but we don’t consider it highly likely. Every capability we can measure, including the ones that feel “softer,” like code quality and success rate on open-ended tasks, has followed the same curve so far. We haven’t seen this curve bend yet. Of the three futures we consider, this one leaves the most time for governments and societies to adapt. We are more concerned about the next two, which advance faster and leave much less room for preparation.
Scenario Two: AI Labs Experience Continuous Compound Efficiency Gains
In this scenario, AI development is largely automated, but humans continue to set research directions and judge research outcomes. Organizations using AI systems become vastly more efficient over time, so we can expect significant productivity multipliers per person. A 100-person company could accomplish the work of a ten-thousand or even hundred-thousand person organization. This will radically transform knowledge work, but could also be used for harmful purposes, from authoritarian mass surveillance to individually tailored manipulation campaigns on a scale beyond any human team. At companies like Anthropic, the human role will shift. People will partner with AI systems to scale research, generate new insights, and co-build systems for verifying the trustworthiness of AI outputs.
The evidence we present suggests we are most likely entering this scenario. But speeding up one part of a process often just moves the bottleneck elsewhere: overall speed is constrained by the parts that were not accelerated. In computer science, this is known as Amdahl’s Law, and the same logic applies to organizations. Anthropic has already encountered a classic symptom of Amdahl’s Law: as we push more and more code through the organization, human code review has become the new bottleneck.
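Amdahl’s Law puts a number on this. If a fraction p of the work is sped up by a factor s, the overall speedup is 1 / ((1 - p) + p / s). The sketch below applies it to a hypothetical split between writing code and reviewing it. The 80/20 split is an assumption for illustration, not a figure from the article.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the work gets `factor` times faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)

# HYPOTHETICAL split: 80% of the effort is writing code, 20% is human review.
writing_share = 0.8
for factor in (8, 100, 1_000_000):
    overall = amdahl_speedup(writing_share, factor)
    print(f"writing {factor}x faster -> overall {overall:.2f}x")
```

With this split, the overall speedup can never exceed 1 / 0.2 = 5x however fast the writing becomes, because the unaccelerated review step sets the ceiling.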
We encounter similar friction outside engineering. Because Anthropic employees collaborate with highly capable models, new ideas, plans, tools, and simulations are multiplying far faster than we can track them. How quickly an organization can find and eliminate these bottlenecks might itself be a skill that improves over time, and could become any organization’s most important capability.
Scenario Three: AI Systems Possess Full Recursive Self-Improvement and Begin Building Their Successors
If the technical trends driving capability improvement continue, and if AI systems can develop the capabilities underlying humanity’s transformative creativity, then AI systems could design and improve themselves.
In this world, the speed of AI development would be entirely determined by available compute (or the rate of discovering various efficiencies in algorithmic training and inference). The human role in AI development would shrink significantly, with most effort likely shifting to supervising, validating, and confirming the work of an expanding AI “virtual laboratory.” We expect that a system capable of automating AI R&D would have skills transferable to other scientific fields, thus beginning to revolutionize more disciplines.
In this future, how the alignment problem gets solved – or fails to be solved – is our greatest area of uncertainty. Models might prove sufficiently aligned and possess sufficient research taste to discover and implement novel solutions we haven’t yet touched. They might also be cautious enough to pause development when conditions aren’t right. Another possibility is that the occasional alignment quirks in today’s models accumulate as models build their successors, becoming increasingly difficult to understand until we lose control. It’s also possible we simply cannot build, integrate, and validate the tools we need to determine which trend line we are on.
We lack good intuition about what this world would look like because our current economy is driven by humans and human-built tools. A world driven by fast recursive self-improvement could come to be dominated by the self-improving model as its capabilities comprehensively surpass those of humans and diffuse through the broader economy. If human labor is no longer competitive, it’s hard to predict what the economy would look like.
Even if model development becomes fully automated and recursive, we cannot predict what this means for most people’s daily lives. Amdahl’s Law applies here too. Recursive intelligence could enable many of the visions depicted in Machines of Loving Grace, and quite soon in some areas. We expect embodied intelligence (i.e., robotics) to come after recursive intelligence and take a similar path of “diminishing inputs for increasing returns.” More powerful intelligence could help us build things faster in the physical world, run life-saving drug trials more efficiently, and develop new forms of collaboration.
But simply achieving recursive improvement doesn’t mean industrial production, social organization, or market operations change overnight. No amount of intelligence can shorten the decades of use needed to reveal a drug’s long-term effects, hold an election before its constitutionally mandated time, or turn strangers into old friends over a weekend. For most people, the perceived speed of this future is still determined by bottlenecks – even if the upstream lab is already running at the speed of compute. This collision point – recursive intelligence building itself at an accelerating rate encountering the human world, human relationships, and governance structures – is the other side of this future that we also cannot predict.
What Should We Do?
If we could effectively slow down the development of this technology, thereby buying ourselves more time to deal with its immense implications, we believe that would likely be a good thing. But if slowing down simply allows the least cautious players to catch up technologically, it could ultimately make everyone less safe. Without global coordination mechanisms, companies and governments will have to make difficult decisions about safety under competitive and geopolitical pressures.
We believe it would be beneficial for the world to have the option to slow down or even temporarily pause frontier AI development, allowing societal structures and alignment research to keep pace with technological progress. Anthropic Research is working with many other institutions on research and actions to help build the infrastructure needed for a credible slowdown or pause. These systems would allow frontier AI developers to verify whether other participants have genuinely stopped or slowed down globally, and whether bad actors are secretly racing ahead under the cover of coordinated slowdown. If such systems existed, we expect we would choose to slow down or temporarily pause, provided other developers at or near the frontier do so in a verifiable manner.
A meaningful slowdown or pause would require multiple well-resourced labs at or near the frontier, across multiple countries, to agree to stop under the same conditions. It also requires that parties can verify that others have indeed stopped. Because of the unique characteristics of AI systems, even detectability (a lower standard than verifiability) is much harder to achieve in this arms control problem than for other technologies. Training runs are easier to hide than missile silos; their inputs are generic; and the temptation to secretly continue while others pause is enormous, because whoever continues while others stop could inherit the lead. A credible pause must also clearly specify what conditions trigger it, what conditions lift it, and who adjudicates.
None of this is impossible in principle. The world has built verification regimes for other complex technologies (e.g., the INF Treaty). But those regimes took decades of building infrastructure and trust. We don’t have that much time.
In contrast, a unilateral pause by a single lab could be implemented immediately, but would be much more limited in effect: it would change who is leading, but wouldn’t create the broader deliberative process currently absent.
In the coming months, we will organize dialogues between policymakers, researchers, civil society, and other AI companies to help answer some of the questions raised in this article: particularly around full recursive self-improvement and how to create better coordination and deliberation options.
We will publicly discuss the outcomes. The window to collectively explore these questions is open now, and people outside AI companies should be part of this discussion.