@dunik_7: An AI more than doubled its own coding ability while the researchers just watched. 20% -> 50% on SWE-bench. They never …
Summary
A paper from Jeff Clune's lab describes the Darwin Gödel Machine, an AI agent that more than doubled its SWE-bench score, from 20% to 50%, by rewriting its own source code without human intervention, using an evolutionary approach.
Cached at: 06/24/26, 08:30 PM
An AI more than doubled its own coding ability while the researchers just watched. 20% -> 50% on SWE-bench. They never touched it.
It pulled that off by rewriting its own source code.
That’s the whole paper: the Darwin Gödel Machine, 72 pages out of Jeff Clune’s lab. An agent finally pointed at itself.
/ it reads its own code and edits one piece: a tool, a retry rule, a prompt
/ it runs the new version on real coding tasks and checks if the score actually moved
/ the variants that win get archived and branched, like evolution, so it never dead-ends
/ then it runs the whole thing again, on the agent that just improved
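The loop above can be sketched in a few lines of Python. This is a toy illustration, not the paper's implementation: the `Agent` fields, the `evaluate` scoring rule, the "keep it if it beats its parent" test, and the score-weighted parent choice are all assumptions made up for the example. A real system would edit actual source code and score it on real coding tasks.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Agent:
    # Hypothetical stand-ins for the pieces an agent might edit in itself.
    tool_quality: int = 0
    retry_limit: int = 0
    prompt_quality: int = 0


def propose_edit(agent: Agent, rng: random.Random) -> Agent:
    """Change exactly one piece of the agent (toy stand-in for a code edit)."""
    field = rng.choice(["tool_quality", "retry_limit", "prompt_quality"])
    delta = rng.choice([-1, 1])
    new_value = max(0, getattr(agent, field) + delta)
    return replace(agent, **{field: new_value})


def evaluate(agent: Agent) -> float:
    """Toy benchmark (assumed): score starts at 0.2 and is capped at 1.0."""
    total = agent.tool_quality + agent.retry_limit + agent.prompt_quality
    return min(1.0, 0.2 + 0.05 * total)


def self_improve(generations: int, seed: int = 0) -> list[tuple[float, Agent]]:
    """Run the edit -> evaluate -> archive -> branch loop; return the archive."""
    rng = random.Random(seed)
    start = Agent()
    archive = [(evaluate(start), start)]
    for _ in range(generations):
        # Branch from any archived variant, weighted toward higher scores,
        # so older variants stay available as starting points.
        weights = [score for score, _ in archive]
        parent_score, parent = rng.choices(archive, weights=weights, k=1)[0]
        child = propose_edit(parent, rng)
        child_score = evaluate(child)
        # Assumed acceptance rule: archive the child only if the score moved up.
        if child_score > parent_score:
            archive.append((child_score, child))
    return archive


if __name__ == "__main__":
    archive = self_improve(generations=200)
    best_score, best_agent = max(archive, key=lambda entry: entry[0])
    print(f"archived variants: {len(archive)}, best score: {best_score:.2f}")
    print(best_agent)
```

Keeping every accepted variant in the archive, rather than only the current best, is what lets the search branch from an older version when the newest one stops improving.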
Everyone’s still arguing about which model is smartest. This quietly skips the question. Same model, same weights; the agent around it gets sharper every pass, by its own hand.
Loop 4 in my breakdown was the one people called sci-fi. Here it is, benchmarked, with a public repo.
The unsettling part isn’t that it worked. It’s that nobody needed to be in the room.
Paper’s below. Read it before it improves itself again.