The author argues against building fully autonomous email agents based on real-world failures, advocating for constrained 'propose-and-approve' workflows where AI prepares context and drafts but humans retain final approval on sends.
Every week a founder messages me wanting an "AI that runs my inbox." Every week I end up talking most of them out of the autonomous version and into something far more boring that actually works.

I build AI workflows for founders and small teams. Thirty-odd of these now. The pattern is so consistent I can call the conversation before it starts.

They come in wanting the dream. They saw the demo where someone's "AI chief of staff" triages, replies, books meetings, and clears the inbox to zero while they sleep. They want that. Then we actually look at their email for ten minutes and I'm explaining why what they need is an assistant that drafts and proposes while they still hit send. You can watch the disappointment land in real time.

Here's what's actually happening. Most "autonomous inbox agents" shipping right now are one bad reply away from torching a customer relationship the owner spent two years building. The autonomy is the part that demos well and the part that gets ripped out by month two. What survives in real businesses is the constrained version: the AI sees everything, prepares everything, decides nothing irreversible on its own.

Three examples from the last few months.

Solo founder, B2B. Wanted an agent that "just answers my email." What she needed was something that drafts every reply with the calendar and the prior thread already pulled in, queued for one-click approval. Same time saved. Zero chance of it promising a customer a refund she never approved. She still uses it daily.

Agency owner. Wanted a "fully autonomous scheduling agent." What he needed was a thing that proposes meeting times that don't collide and writes the email; he sends. We didn't build an agent. We removed the three-tab dance. He stopped losing an hour a day to calendar tetris.

Two-person startup. Wanted "AI that manages all comms." What they needed was pre-meeting prep: who is this, what did we last say, what's on the calendar, in one place before the call. No autonomy at all.
It's the feature they'd now refuse to give up.

None of these are autonomous agents. Every one of them beats the agent the founder originally asked for, because the agent would have confidently sent something wrong in week three and the trust never comes back.

Why autonomous inbox agents keep failing in production

Email is irreversible and adversarial. A sent message can't be unsent, and one hallucinated commitment to a customer costs far more than the time saved on the other 200 messages.

A good assistant has a human at exactly one checkpoint: the send. An autonomous agent removes the one checkpoint that actually mattered. Beautiful in a demo. Catastrophic the first time a customer phrases something weird at 2am.

The people quietly winning with AI in their inbox right now aren't running autonomous agents. They wired a model into their actual mail and calendar (over MCP, usually, so it can see the real context instead of guessing) and kept themselves in the loop on anything that leaves the building. Superhuman's AI, Claude connected to mail over MCP, the Slashy MCP, even the native assistants (e.g. Slashy, Superhuman, Fyxer): the boring constrained setups are the ones still running on a Tuesday.

In anything regulated or client-facing, full autonomy is doubly cursed. The first question anyone serious asks is "what can it send without you?" "Nothing without approval" ends the conversation in your favor. "It decides" turns it into a liability review.

How to actually decide

Before you pay anyone to build an autonomous inbox agent, answer these on paper:

Is every outbound action reversible? If no, you want propose-and-approve, not autonomy.

Can a wrong message cost you a customer or a contract? If yes, keep the human on send. Full stop.

Do you actually need it to act, or do you need it to prepare? Most people need preparation (context assembled, draft written), not autonomy.

Will anyone ever audit what it sent?
If yes, you want a system where every action had a human checkpoint.

If you're a builder: you'll make more money in the next year shipping honest assistants that draft-and-wait than chasing the "fully autonomous AI employee" headline. The first wave got burned and they're warning the next one. Be the person whose thing still works on Thursday because it never had the authority to break anything.

Operators, builders, anyone with an AI touching real email: what's actually working? What blew up? Genuinely want the war stories.
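
For builders who want something concrete: here is a rough Python sketch of the four questions under "How to actually decide" plus the propose-and-approve rule. Every name in it (`recommend_mode`, `Draft`, `ApprovalQueue`, the `send` callback) is made up for illustration, not taken from any client build or any real mail API, and the way the answers map to a mode is just one reasonable reading of the checklist.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Mode(Enum):
    PREPARE_ONLY = "prepare-only"
    PROPOSE_AND_APPROVE = "propose-and-approve"
    AUTONOMOUS = "autonomous"


def recommend_mode(reversible: bool, can_cost_customer: bool,
                   needs_to_act: bool, will_be_audited: bool) -> Mode:
    """Hypothetical helper: turn the four checklist answers into a mode."""
    if not needs_to_act:
        # Most people need context and drafts assembled, nothing sent.
        return Mode.PREPARE_ONLY
    if not reversible or can_cost_customer or will_be_audited:
        # Any one of these answers keeps the human on send.
        return Mode.PROPOSE_AND_APPROVE
    return Mode.AUTONOMOUS


@dataclass
class Draft:
    to: str
    body: str
    approved: bool = False


@dataclass
class ApprovalQueue:
    """Drafts pile up here; nothing goes out until a human approves it."""
    send: Callable[[Draft], None]  # assumed: your real mail-sending function
    pending: List[Draft] = field(default_factory=list)

    def propose(self, draft: Draft) -> None:
        # The model may call this as often as it likes. It never sends.
        self.pending.append(draft)

    def approve_and_send(self, index: int) -> Draft:
        # Only a human-triggered call reaches the send function.
        draft = self.pending.pop(index)
        draft.approved = True
        self.send(draft)
        return draft


if __name__ == "__main__":
    print(recommend_mode(reversible=False, can_cost_customer=True,
                         needs_to_act=True, will_be_audited=True))
    outbox: List[Draft] = []
    queue = ApprovalQueue(send=outbox.append)
    queue.propose(Draft(to="customer@example.com", body="Draft reply"))
    print(len(outbox))  # still empty: proposing is not sending
    queue.approve_and_send(0)
    print(len(outbox))  # one message, sent only after approval
```

The design point is that the model only ever gets `propose`; the send path exists in exactly one place and a person triggers it.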