This article discusses how AI coding agent loops can inadvertently learn and propagate deprecated code patterns from existing codebases, leading to technical debt despite appearing successful.
Steinberger posted over the weekend about how he doesn't write code anymore, just designs agent loops. Boris Cherny from Anthropic said basically the same thing: he doesn't prompt Claude, he just creates loops and they handle the rest. If you're at Anthropic and tokens are essentially free, sure, let it loop all day. Most of us are paying real money for every file the agent reads.

Full disclosure: I run a software delivery company and we do a lot of brownfield work, so this is what I'm seeing from that side.

We set up agent loops on a client's core product last quarter. The agents were fast. Four features shipped in a week. PRs looked clean, CI passed, the team was excited about it. Then security review caught it. All four features had used a pattern the team had been trying to get rid of for two years.

The old pattern was in something like 40+ files across the codebase. The new one existed in maybe 6. The agent looked at what was most common and followed it. I mean, why wouldn't it? It doesn't know your team has a migration plan. It doesn't read your architecture decision records. It reads your code. And your code told it the deprecated way was the right way, because that's what most of the codebase looked like.

Nobody caught it in code review either, because every PR was functional. The code worked. It was just wrong in a way you'd only notice if you knew the team was actively moving away from that pattern.

On a greenfield project the agent only has your prompt and system instructions to go on. You control the context. On brownfield, the codebase is the context, and it drowns out whatever you put in your prompt. 40 files beat one paragraph of instructions every single time.

Everyone throws around the "88% of agent projects fail before production" stat. I think there's a worse number that nobody is tracking: how many reach production and succeed by every visible metric while putting back the same tech debt the team was trying to pay down? Because that's what I keep seeing. Features ship, velocity looks great in the sprint review, and the whole time the codebase is getting worse underneath.

I write about what we're seeing across 100+ engineering engagements in a weekly breakdown, [click here](https://thefoundation.limestonedigital.com/p/not-every-codebase-deserves-loops) if you want to read more on this topic.

Anyway, I'm not saying don't use loops. I'm saying before you point one at an existing codebase, figure out what's in there that you wouldn't want it to learn from. Because it will learn from all of it. It doesn't have opinions about which is which.
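If you want a rough way to do that "figure out what's in there" step, here is a minimal sketch in Python. It is not what we ran on the client project, just an illustration: it counts how many files contain each of two patterns so you can see which one an agent is likely to treat as the norm. The function names, the regexes, and the `legacy_auth` / `auth_v2` identifiers are all hypothetical placeholders; swap in whatever your own migration is actually about.

```python
import re
from pathlib import Path


def count_files_by_pattern(root, patterns, glob="**/*.py"):
    """Count how many files under `root` contain each named regex.

    `patterns` maps a label (e.g. "deprecated") to a regex string.
    A file is counted once per label, no matter how many matches it has.
    """
    compiled = {name: re.compile(rx) for name, rx in patterns.items()}
    counts = {name: 0 for name in patterns}
    for path in Path(root).glob(glob):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for name, rx in compiled.items():
            if rx.search(text):
                counts[name] += 1
    return counts


def deprecated_dominates(counts, deprecated, preferred):
    """True when the deprecated pattern is in at least as many files as the preferred one."""
    return counts[deprecated] >= counts[preferred]


# Hypothetical patterns: replace with the old and new idioms from your own codebase.
EXAMPLE_PATTERNS = {
    "deprecated": r"\blegacy_auth\(",
    "preferred": r"\bauth_v2\(",
}
```

Calling `count_files_by_pattern("path/to/repo", EXAMPLE_PATTERNS)` returns a dict of file counts per label. If `deprecated_dominates` comes back true, that is a hint the codebase will outvote your prompt, and it is probably worth narrowing what the loop can read or finishing more of the migration first. A file count is a crude proxy for what an agent actually weighs, so treat the result as a warning sign, not a measurement.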
The article discusses common failure patterns in agentic AI systems, specifically 'dumb AI loops,' citing issues like state poisoning and data leaks observed in Claude Code deployments.
The article argues that the main challenge for AI coding agents is not code generation quality but whether users can build reliable loops around them, shifting the barrier to adoption from model capability to workflow design.
The article argues that AI coding tools are generating hidden technical debt in enterprise codebases by ignoring established organizational conventions, a problem that requires better context awareness rather than just improved model quality.
Loop engineering is the practice of designing systems where AI agents autonomously decide what to work on, execute, and iterate, going beyond manual prompting by building outer loops that compound across different domains. The article explains the two-layer agent harness and how sharing artifacts between loops creates compounding learning.
The article discusses how the AI engineering loop can be fully automated but argues that handing over the entire loop produces 'agent slop' due to imperfect evals. It recommends automating certain steps while keeping human judgment for nuance.