@dunik_7: An AI more than doubled its own coding ability while the researchers just watched. 20% -> 50% on SWE-bench. They never …

X AI KOLs Timeline 06/24/26, 09:41 AM Papers

self-improving-ai code-generation swe-bench evolution ai-research darwin-godel-machine

Summary

A paper from Jeff Clune's lab, the Darwin Gödel Machine, describes an AI agent that more than doubled its SWE-bench score, from 20% to 50%, by rewriting its own source code without human intervention, using an evolutionary approach.

Original Article

Cached at: 06/24/26, 08:30 PM

An AI more than doubled its own coding ability while the researchers just watched. 20% -> 50% on SWE-bench. They never touched it.

It pulled that off by rewriting its own source code.

That’s the whole paper: the Darwin Gödel Machine, 72 pages out of Jeff Clune’s lab. An agent finally pointed at itself.

/ it reads its own code and edits one piece: a tool, a retry rule, a prompt
/ it runs the new version on real coding tasks and checks if the score actually moved
/ the variants that win get archived and branched, like evolution, so it never dead-ends
/ then it runs the whole thing again, on the agent that just improved
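
The four steps above can be sketched as a toy loop. This is not the paper's code, just a minimal illustration under loud assumptions: the agent's "source" is reduced to a dict of on/off settings, `evaluate` is a made-up stand-in for running coding tasks, and the names `Variant`, `mutate`, and `self_improve` are hypothetical. The parent-sampling weights and the "keep only if it beats its parent" rule are also assumptions, picked to mirror "the variants that win get archived and branched".

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Variant:
    config: Dict[str, int]   # hypothetical stand-in for the agent's own code
    score: float             # benchmark score measured for this variant
    parent: Optional[int]    # archive index of the variant it branched from


def evaluate(config: Dict[str, int]) -> float:
    """Toy benchmark (assumption): fraction of settings switched on."""
    return sum(config.values()) / len(config)


def mutate(config: Dict[str, int], rng: random.Random) -> Dict[str, int]:
    """Edit exactly one piece of the agent by flipping a single setting."""
    child = dict(config)
    key = rng.choice(sorted(child))
    child[key] = 1 - child[key]
    return child


def self_improve(seed: Dict[str, int], generations: int,
                 rng: random.Random) -> List[Variant]:
    archive = [Variant(seed, evaluate(seed), None)]
    for _ in range(generations):
        # Any archived variant can be branched, not only the current best;
        # sampling is weighted toward higher scores (assumed weighting).
        weights = [v.score + 0.1 for v in archive]
        parent_idx = rng.choices(range(len(archive)), weights=weights, k=1)[0]
        parent = archive[parent_idx]

        # Step 1: edit one piece. Step 2: check if the score actually moved.
        child_config = mutate(parent.config, rng)
        child_score = evaluate(child_config)

        # Step 3: archive the winners. Step 4 is the next loop iteration.
        if child_score > parent.score:
            archive.append(Variant(child_config, child_score, parent_idx))
    return archive


if __name__ == "__main__":
    seed_agent = {"tool": 0, "retry_rule": 0, "prompt": 1, "planner": 0, "tests": 1}
    result = self_improve(seed_agent, generations=50, rng=random.Random(0))
    best = max(result, key=lambda v: v.score)
    print(f"archive size: {len(result)}, seed: {result[0].score:.2f}, best: {best.score:.2f}")
```

Keeping every winner in the archive, instead of only the current best, is what lets a later pass branch from an older variant when the newest one stalls.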

Everyone’s still arguing about which model is smartest. This quietly skips the question. Same model, same weights; the agent around it gets sharper every pass, by its own hand.

Loop 4 in my breakdown was the one people called sci-fi. Here it is, benchmarked, with a public repo.

The unsettling part isn’t that it worked. It’s that nobody needed to be in the room.

Paper’s below. Read it before it improves itself again.

Similar Articles

@Khazix0918: https://x.com/Khazix0918/status/2062731170337763796

@rohanpaul_ai: Brilliant new paper from Meta, CMU and other labs. Shows that coding agents improve faster by manufacturing their own s…

@rohanpaul_ai: MIT study. Code volume surges by 300%, but output increases by only 30%: The AI dividend meets an awkward reality. They…

When AI Builds Itself: Our progress toward recursive self-improvement

@AlexGDimakis: I am very excited about this research: We show 2 things: 1. If you just do random sampling (i.e. you try to solve a pro…
