@omarsar0: How to effectively run autonomous long-running coding agents? This is one of the most exciting discussions on agents I'…

X AI KOLs Following 06/12/26, 05:50 PM Events

Summary

A recorded discussion about effectively running autonomous long-running coding agents, including insights on goal setting, model selection, and best practices, made freely available.

Original Article

Cached at: 06/12/26, 09:02 PM

How to effectively run autonomous long-running coding agents?

This is one of the most exciting discussions on agents I’ve ever had.

I recorded it and am making it freely available.

(bookmark it)

The idea of autonomous long-running agents is a real thing.

We talk about lots of things like /goal, /loop, and dynamic workflows, and what comes next.

One interesting discussion was around how to make the agent run for longer while ensuring it stays on track.

Most models today struggle to coordinate work effectively. They sometimes pause the work early, make lots of mistakes, and take weird shortcuts (reward hacking).
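To make the "runs longer but stays on track" idea concrete, here is a minimal sketch of an outer loop that keeps an agent working until an independent check accepts the result. It is not how /goal or /loop are implemented in any particular tool; `run_until_goal`, `Verdict`, and the toy step and evaluator functions are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    done: bool      # evaluator believes the goal is met
    feedback: str   # what is still missing, fed into the next step


def run_until_goal(
    step: Callable[[str], str],
    evaluate: Callable[[str], Verdict],
    max_steps: int = 20,
) -> List[str]:
    """Call `step` repeatedly until `evaluate` accepts the output or the budget runs out."""
    history: List[str] = []
    feedback = "start"
    for _ in range(max_steps):
        output = step(feedback)        # the agent does one unit of work
        history.append(output)
        verdict = evaluate(output)     # independent check, not the agent's own claim
        if verdict.done:
            break
        feedback = verdict.feedback    # push the agent back on track
    return history


# Toy stand-ins (assumptions): the "agent" claims it is finished after every
# step, but the evaluator only accepts once three tests pass.
state = {"passing": 0}


def toy_step(feedback: str) -> str:
    state["passing"] += 1
    return f"I am done. passing_tests={state['passing']}"


def toy_evaluate(output: str) -> Verdict:
    passing = int(output.rsplit("=", 1)[1])
    return Verdict(done=passing >= 3, feedback=f"{3 - passing} tests still failing")


history = run_until_goal(toy_step, toy_evaluate)
print(len(history))  # 3
```

The point of the design is that the stop condition lives outside the agent: an early "I am done" is ignored until the evaluator agrees, and `max_steps` bounds the run if it never does.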

What helps is to be extremely clear about the goals it needs to achieve. Spell out the dos and don’ts. Eliminate any assumptions you think the model would make. Deep expertise matters so much in this.
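One way to force that clarity is to write the goal as structured data before turning it into a prompt. The sketch below is only an illustration: `GoalSpec`, its fields, and the example entries are hypothetical, not a format that any coding agent requires.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GoalSpec:
    objective: str
    dos: List[str] = field(default_factory=list)
    donts: List[str] = field(default_factory=list)
    assumptions: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the spec as plain text that can be pasted into a goal prompt."""
        def section(title: str, items: List[str]) -> str:
            lines = [f"{title}:"]
            lines += [f"- {item}" for item in items] or ["- (none)"]
            return "\n".join(lines)

        return "\n\n".join([
            f"Objective: {self.objective}",
            section("Do", self.dos),
            section("Do not", self.donts),
            section("Stated assumptions", self.assumptions),
        ])


# Hypothetical example goal; the wording is illustrative only.
spec = GoalSpec(
    objective="Make the test suite pass",
    dos=["Run the full test suite before reporting success"],
    donts=["Delete, skip, or weaken existing tests"],
    assumptions=["The public API must stay unchanged"],
)
prompt = spec.render()
print(prompt)
```

An empty section renders as "(none)", which makes a missing list of don'ts or assumptions visible instead of silently absent.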

But you can get far through careful planning. My formula currently is to use Opus 4.8 for careful planning and GPT-5.5 for all execution. For the evaluator (via /goal), I often use something like DeepSeek or the latest models from Qwen, Kimi, MiniMax, etc.
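The planner, executor, and evaluator split can be written down as a small routing table. This is a sketch under assumptions: the model strings are plain labels copied from the post, not real API model identifiers, and `RoleConfig`, `ROLES`, and `model_for` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RoleConfig:
    model: str    # label only; map it to a real model id in your own setup
    purpose: str


# Roles follow the post; the evaluator entry picks one of the options it mentions.
ROLES: Dict[str, RoleConfig] = {
    "planner": RoleConfig("opus-4.8", "write the plan and the goal"),
    "executor": RoleConfig("gpt-5.5", "carry out every step of the plan"),
    "evaluator": RoleConfig("deepseek", "judge whether the goal is met"),
}


def model_for(role: str) -> str:
    """Return the model label configured for a role, or fail loudly."""
    try:
        return ROLES[role].model
    except KeyError:
        raise ValueError(f"unknown role: {role!r}") from None


print(model_for("planner"))    # opus-4.8
print(model_for("evaluator"))  # deepseek
```

Keeping the evaluator on a different model from the executor means the work is not graded by the same model that produced it.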

Another insight we discussed for enforcing goals is to provide strong visual cues for the agent to compare against. I found that a multimodal goal is a much stronger goal than a plain-text one. Also, use agents to help you set clear goals.
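The post most likely means showing reference images to a multimodal model. As a simpler programmatic stand-in for "compare against a visual reference", the sketch below scores how far a rendered result is from a target image. `mismatch_ratio`, the nested-list image format, and the tolerance value are assumptions made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255
Image = List[List[Pixel]]      # rows of pixels


def mismatch_ratio(reference: Image, candidate: Image, tolerance: int = 16) -> float:
    """Fraction of pixels whose largest channel difference exceeds `tolerance`."""
    same_shape = len(reference) == len(candidate) and all(
        len(r) == len(c) for r, c in zip(reference, candidate)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in reference)
    if total == 0:
        raise ValueError("images must not be empty")

    differing = 0
    for ref_row, cand_row in zip(reference, candidate):
        for ref_px, cand_px in zip(ref_row, cand_row):
            if max(abs(a - b) for a, b in zip(ref_px, cand_px)) > tolerance:
                differing += 1
    return differing / total


WHITE: Pixel = (255, 255, 255)
RED: Pixel = (255, 0, 0)

reference = [[WHITE, WHITE], [WHITE, WHITE]]   # the visual goal
candidate = [[WHITE, RED], [WHITE, WHITE]]     # what the agent produced

ratio = mismatch_ratio(reference, candidate)
print(ratio)  # 0.25
```

A score like this gives the evaluator a number to report back ("25% of pixels differ from the reference"), which is harder to argue around than a text-only description of the target.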

Watch here: https://academy.dair.ai/events/cmplo7v3b000e04l1pxprat4d…

Autonomous Long-Running Coding Agents

Source: https://academy.dair.ai/events/cmplo7v3b000e04l1pxprat4d

Free Event (Event Ended)

What You’ll Get

Recording available after

About This Event

One of the areas where I am spending the most time today is building with long-running agents.

You may have used /goal in coding agents like Claude Code, Hermes Agent, Codex, or Cursor. That’s one way to do it, but long-running agents are a broader topic, and I have been testing and researching them for the past couple of months.

If you want to dive deep into this, including tips, use cases, and best practices, I will be doing a live session on the topic.

Join us live!

Go Deeper With A Course

https://academy.dair.ai/courses/build-apps-with-claude-code
