How are you actually deciding which agent actions need human approval before executing?

Reddit r/AI_Agents 06/09/26, 01:09 AM News

agent-architecture human-approval safety ai-agents guardrails incident

Summary

The post asks how to decide which AI agent actions require human approval before execution. It cites a January 2026 incident in which an agent with valid permissions transferred $27M with no approval gate in place, and proposes a framework based on two axes: reversibility and impact.

I've been thinking a lot about where approval gates belong in agent architectures, and I keep coming back to the same problem: most teams either gate too much (the agent becomes unusable) or gate nothing and hope the model makes good decisions.

In January 2026, an AI agent transferred $27M with no human approval gate at all. Not a jailbreak, not a prompt injection: the agent had the permissions and no gate existed. That's a design decision that went wrong.

The framing I've landed on is two axes: reversibility and impact. Hard to reverse and high impact means gate before execution. Easily reversed and low impact means let it run. The hard cases are the mixed ones: hard to reverse but low impact, or high impact but easily reversed.

But this still leaves open questions I don't have clean answers to:

What do you do when the gate gets no response? Default to blocked, or default to proceed? I strongly believe it should fail closed, but I've seen teams argue the opposite for UX reasons.

How do you handle cascading tool calls where one approved action triggers a second action that should also require approval? Does the first approval carry over?

At what dollar threshold does a financial action need a gate? $1K? $10K? It depends entirely on the use case, but I haven't seen anyone publish a principled framework for this.

Curious how others are drawing these lines in production. What criteria are you actually using?
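To make the framing concrete, here is a minimal Python sketch of the two-axis check with a fail-closed gate. Everything in it is an assumption for illustration, not a published framework: the `Action` fields, the `classify` and `execute` helpers, the $10K threshold, and the choice to gate both mixed cases are all hypothetical. It also takes one position on cascading calls: every action in a chain is classified and approved on its own, so an earlier approval does not carry over.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Decision(Enum):
    RUN = "run"
    GATE = "gate"


@dataclass(frozen=True)
class Action:
    name: str
    reversible: bool    # can the effect be undone cheaply?
    impact_usd: float   # rough estimate of what is at stake


# Hypothetical threshold; the right number depends on the use case.
IMPACT_THRESHOLD_USD = 10_000.0

# An approver returns True (approved), False (rejected),
# or None (no response, e.g. the request timed out).
Approver = Callable[[Action], Optional[bool]]


def classify(action: Action, threshold: float = IMPACT_THRESHOLD_USD) -> Decision:
    """Two-axis check: only reversible AND low-impact actions run ungated."""
    high_impact = action.impact_usd >= threshold
    if action.reversible and not high_impact:
        return Decision.RUN
    # Assumption: both mixed cases are gated, which is the conservative choice.
    return Decision.GATE


def execute(action: Action, approver: Approver, run: Callable[[Action], None]) -> str:
    """Run the action, asking for approval first if it is gated."""
    if classify(action) is Decision.GATE:
        answer = approver(action)
        # Fail closed: anything other than an explicit True blocks the action,
        # including None (no response).
        if answer is not True:
            return "blocked"
    run(action)
    return "executed"


def execute_chain(actions: List[Action], approver: Approver,
                  run: Callable[[Action], None]) -> List[str]:
    """Each action is gated independently; approval never carries over."""
    results = []
    for action in actions:
        status = execute(action, approver, run)
        results.append(status)
        if status == "blocked":
            break  # stop the cascade at the first blocked step
    return results
```

With this shape, a silent approver (one that returns `None`) blocks a gated action, and a chain of two gated actions asks for approval twice. Teams that prefer default-to-proceed would flip the `answer is not True` check, which is exactly the trade-off in the first open question.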

Similar Articles

Approval is not review if the human cannot inspect the action

Agent rules need to exist where the action happens

Human approval is not a weakness in AI agents

How are you all handling irreversible actions in production agents? I gave up on prompts and built an external risk gate.

AI agents are about to create a responsibility problem nobody wants to own
