@ericzakariasson: https://x.com/ericzakariasson/status/2070493377267646797


Summary

A practical guide on setting up iterative loops for AI coding agents with defined stop conditions, cloud execution, and notification channels to offload work without constant babysitting.

https://t.co/drBAUlV6B3

Cached at: 06/26/26, 02:12 PM

Human in the /loop

What I like most about coding with agents right now is the room to leave a few runs going and still get on with other work. When something finishes or needs a call, I show up.

This post is a short explainer of the setup I use: a definition of done the agent can score, a loop that keeps going until it should stop, and pings so I know when to lean in.

Find something the agent can verify

Before kicking off a longer running task, I lock a definition of done. Examples I actually use:

  • Model or eval work. Target is a score. Change the approach, run the eval, keep the change only if the number moved the right way. Closest to Karpathy’s autoresearch for ML training loops.

  • Web app or UI. Target is a QA pass. Load the page or run Playwright, screenshot it, make sure it still does the thing.

  • Backend or refactor. Target is the test suite. Failing tests first, then green, and it has to stay green.

  • Speed or flakiness. Target is a number (p95, a benchmark). Change and measure until you’re under the line you set.

  • Data or content cleanup. Target is a count. Loop until zero rows fail validation, or every item passes the check.

Writing the loop is mostly writing how you’d check the work yourself. Some tasks need every step on the page. Others I give a goal and a rough direction and let the model fill in the middle. I start more explicit than I think I need, then loosen it once I see what it can infer.
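As a minimal sketch of what "a definition of done the agent can score" could look like in code, the snippet below wraps a measurement and a target into one object. `DoneCheck`, `count_invalid`, and the email rule are hypothetical names chosen for illustration, not part of any agent tool; the example uses the data-cleanup case, where the target is zero failing rows.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DoneCheck:
    """A definition of done the agent can score (hypothetical helper)."""
    name: str
    measure: Callable[[], float]   # runs the eval, test suite, benchmark, ...
    target: float
    higher_is_better: bool = True

    def is_done(self) -> tuple[bool, float]:
        """Measure once and report (target met?, measured value)."""
        value = self.measure()
        if self.higher_is_better:
            return value >= self.target, value
        return value <= self.target, value


def count_invalid(rows: list[dict]) -> int:
    """Toy validation rule (an assumption): a row fails if its email is empty."""
    return sum(1 for row in rows if not row.get("email"))


rows = [{"email": "a@example.com"}, {"email": ""}, {"email": "c@example.com"}]

cleanup = DoneCheck(
    name="rows failing validation",
    measure=lambda: count_invalid(rows),
    target=0,
    higher_is_better=False,   # a count of failures should go down
)

done, value = cleanup.is_done()
print(f"{cleanup.name}: {value} (done={done})")
```

The same shape would cover the other cases by swapping `measure`: an eval score with `higher_is_better=True`, or a p95 latency with `higher_is_better=False`.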

Wrap it in a loop

Definition of done in hand, I tell the agent to loop on it. Change something, measure, keep or revert, go again. Doesn’t have to be one tiny edit each time. The step just has to be measurable against the target. I care most about the stop conditions, which might be:

  • Metric hits the target

  • No improvement after a few tries

  • Out of ideas

  • Blocked or unsure (stop and ask)
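The loop and its four stop conditions can be sketched in a few lines. This is an illustration under stated assumptions, not how `/loop` is implemented: `run_loop`, `Stop`, `Blocked`, and `scripted` are hypothetical names, a higher score is assumed to be better, and the "agent" is a toy that proposes integer changes.

```python
from enum import Enum


class Stop(Enum):
    TARGET_HIT = "metric hit the target"
    STALLED = "no improvement after a few tries"
    OUT_OF_IDEAS = "out of ideas"
    BLOCKED = "blocked or unsure, stopping to ask"


class Blocked(Exception):
    """Raised by a step that needs a human decision."""


def run_loop(state, propose, score, target, patience=3):
    """Change, measure, keep or revert, repeat. Assumes higher score is better."""
    best = score(state)
    misses = 0                           # consecutive tries without improvement
    while best < target:
        if misses >= patience:
            return state, best, Stop.STALLED
        try:
            candidate = propose(state)   # the "change something" step
        except Blocked:
            return state, best, Stop.BLOCKED
        if candidate is None:            # propose signals it has nothing left
            return state, best, Stop.OUT_OF_IDEAS
        value = score(candidate)         # the "measure" step
        if value > best:
            state, best, misses = candidate, value, 0   # keep
        else:
            misses += 1                  # revert: the candidate is dropped
    return state, best, Stop.TARGET_HIT


def scripted(deltas):
    """Toy stand-in for an agent: proposes state + delta until deltas run out."""
    remaining = iter(deltas)

    def propose(state):
        delta = next(remaining, None)
        return None if delta is None else state + delta

    return propose


final, best, reason = run_loop(4, scripted([2, -1, 3, 5]), score=lambda s: s, target=10)
print(final, best, reason.value)
```

Note that `score` is passed in and never modified by the loop, which mirrors treating the check as the source of truth.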

Get pinged instead of babysitting

So the agent gets a notify path (MCP plus /notify) and reaches me there. Usually Slack, because that’s where everything else already is. Same setup could be iMessage or whatever. I treat it as a generic notification channel, not full Slack access for the agent. Status updates and “I need a decision” show up like normal messages. When I answer, that reply is the next thing the loop runs on.

Run it in the cloud

Most of this doesn’t stay on my laptop. It runs in the cloud so a loop can keep going for hours without my machine being open. I use my own client as the orchestrator and fan work out to cloud agents from there.
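The orchestrator-plus-cloud-agents shape can be sketched with `asyncio`: one local coroutine fans tasks out and collects results as they finish. `cloud_agent` here is a local stand-in (an assumption for illustration); a real one would call whatever cloud agent API is in use, which this sketch does not model.

```python
import asyncio


async def cloud_agent(task: str, steps: int) -> str:
    """Stand-in for a remote agent run; nothing here talks to a real service."""
    for _ in range(steps):
        await asyncio.sleep(0)   # yield control, as a network wait would
    return f"{task}: finished, steps={steps}"


async def orchestrate(tasks: dict[str, int]) -> list[str]:
    """Fan every task out at once and wait for all of them."""
    runs = [cloud_agent(name, steps) for name, steps in tasks.items()]
    return await asyncio.gather(*runs)   # results come back in input order


results = asyncio.run(orchestrate({"speed up p95": 3, "fix flaky test": 1}))
for line in results:
    print(line)
```

The local process only waits and collects; the work itself would happen remotely, which is why the laptop does not need to do anything heavy.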

Then start the next one

Once a loop is running, I start another. Usually three or so, sometimes five. And that’s only the long loops. I often have other agents up at the same time on shorter work: a PR, a one-off investigation, something that isn’t a multi-hour hill climb. If things are quiet I fire off another. If three are waiting on me I stop starting stuff and go review.
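The habit described above amounts to a small backpressure rule. As a sketch, with the thresholds taken from the numbers in the text and treated as adjustable assumptions (`next_move` is a hypothetical name):

```python
def next_move(long_loops_running: int, waiting_on_me: int,
              max_long_loops: int = 5, review_threshold: int = 3) -> str:
    """Decide whether to review, start another loop, or leave things alone."""
    if waiting_on_me >= review_threshold:
        return "review"           # stop starting things, go unblock agents
    if long_loops_running < max_long_loops:
        return "start another"    # things are quiet and there is headroom
    return "wait"                 # at capacity and nothing needs a decision


print(next_move(long_loops_running=2, waiting_on_me=0))
print(next_move(long_loops_running=4, waiting_on_me=3))
```

Review takes priority over starting new work, since agents waiting on a decision are idle until they get one.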

Prompting

Rough template of how I prompt. /loop drives the iterations and /notify keeps me posted.

Task:

/loop until <metric / tests / eval / QA check> hits <target>. Treat it as the source of truth and don’t change it while you run.

Stop early if it stalls for several tries or you run out of ideas. If you’re blocked or unsure, stop and ask.

/notify on start, on anything surprising, and when you’re done or stuck. Ping me when you need a decision.

If you’re running loops, I’d love to hear how we can make it easier for you!
