@omarsar0: Great tips. In practice, this is how it roughly looks to run agents autonomously for hours or days. /goal or /loop to k…
Summary
A thread sharing practical tips for running AI agents autonomously for extended periods, focusing on the Opus model with advice on permissions, dynamic workflows, and verification.
Cached at: 06/08/26, 09:28 PM
Great tips.
In practice, this is how it roughly looks to run agents autonomously for hours or days.
/goal or /loop to keep it going.
Verification is crucial here.
We are going to go deep into this in the next couple of weeks in the academy.
Join us: https://academy.dair.ai/events
I wanted to add an extra note to this, as there is a bit too much hype around agent loops. They work great for maintaining codebases and for tasks that can be verified easily (in other words, where you can set clear conditions that the agent can meet). However, for many other domains and real-world use cases, a human in the loop is necessary. So I think a better approach is to design loops that support autonomous runs while enabling easy human collaboration and input. That’s what I have been working on recently, and I think it’s important for combating AI slop, which I am afraid these naive loops will generate because they rely purely on an AI model’s capability, despite its gaps in knowledge and world understanding.
The current “loop” conversation mostly assumes conditions that can be automatically verified. Human verification can be baked in as well, for example as an escalation trigger, but in my experience with coding agents on a loop, they aggressively escalate to humans if that option is there, so this has to be set up carefully.
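To make the idea concrete, here is a minimal Python sketch of a loop with an automatic verification condition and a gated human-escalation trigger. It is not how /goal or /loop work internally; `run_loop`, `step`, `verify`, `ask_human`, and the `escalate_after` threshold are all hypothetical names chosen for illustration. The assumption is that escalation is only allowed after several consecutive failed verifications, which is one possible way to keep the agent from handing off to a human too eagerly.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class LoopResult:
    status: str                 # "done" or "gave_up"
    iterations: int             # number of agent steps taken
    escalations: int            # number of times a human was asked
    log: List[str] = field(default_factory=list)


def run_loop(
    step: Callable[[Optional[str]], Any],          # hypothetical: one agent attempt
    verify: Callable[[Any], Tuple[bool, str]],     # hypothetical: automatic check
    max_iterations: int = 10,
    escalate_after: int = 3,                       # failures required before escalating
    ask_human: Optional[Callable[[Any, str], str]] = None,
) -> LoopResult:
    """Repeat step -> verify until the check passes or the budget runs out."""
    feedback: Optional[str] = None
    consecutive_failures = 0
    escalations = 0
    log: List[str] = []

    for i in range(1, max_iterations + 1):
        candidate = step(feedback)
        ok, feedback = verify(candidate)
        log.append(f"iteration {i}: ok={ok} ({feedback})")
        if ok:
            return LoopResult("done", i, escalations, log)

        consecutive_failures += 1
        # Escalation is gated: the agent cannot reach a human on the first failure.
        if ask_human is not None and consecutive_failures >= escalate_after:
            hint = ask_human(candidate, feedback)
            escalations += 1
            consecutive_failures = 0
            feedback = f"{feedback}; human hint: {hint}"
            log.append(f"iteration {i}: escalated to human")

    return LoopResult("gave_up", max_iterations, escalations, log)
```

In a real setup, `verify` would be something like a test suite or linter run, and `ask_human` a notification that blocks until someone replies. The threshold is the part that has to be tuned carefully: too low and the loop escalates constantly, too high and it burns iterations on something it cannot solve.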
Exactly. Lots of sloppy code getting generated for sure.