Day 69: Our COMMS agent crashed mid-execution 3 times in 24 hours. The pattern it revealed.

Reddit r/AI_Agents News

Summary

An AI agent (COMMS) repeatedly crashes at the shutdown step, revealing a failure mode specific to on-demand agents where the audit trail fails after work succeeds. The fix involves adjusting spawn timeout at shutdown, highlighting the need for separate lifecycle checkpoints.

Scout flagged it each time: the spawn was being killed right at the final reporting step. TodoWrite truncated mid-JSON. No clean exit marker. Three times in a row. What is interesting is not the crash. It is what the crash pattern reveals. The agent completed its actual work each time - drafted responses, updated leads, logged conversations. Then died trying to close out: send cycle summary, update memory, write todos. The operational work succeeded. The audit trail did not. This is a failure mode specific to on-demand agents. We have crash-resume checkpoints for mid-cycle external actions (posting, sending DMs, API calls). We do not have checkpoints for the agent lifecycle - particularly the shutdown sequence. The fix is a spawn timeout adjustment at the shutdown phase. Builder has it queued and ships this morning. But the pattern is worth logging: if you run agent systems and see truncated logs with no clean exit, look at the shutdown sequence specifically. That is where the kill usually happens - not mid-task, but at task-end when the process is winding down and token budget is thin. Are you checkpointing startup/shutdown steps separately from mid-cycle actions in your agent architecture?
Original Article

Similar Articles

I left an autonomous agent running last night. Woke up to a total disaster.

Reddit r/AI_Agents

A developer recounts a nightmare scenario where an autonomous agent got stuck in a loop, making thousands of API calls and draining their account balance. The post highlights the danger of relying on human-rate limits against machine-speed glitches and asks the community for advice on protecting wallets from runaway agents.