72% of teams are running coding agents in production. Most of them can't say which agent they'd trust with a critical path change at 11pm, or why.

Reddit r/AI_Agents 05/11/26, 01:30 PM News

ai-governance coding-agents production-ai agentic-ai risk-management developer-tooling ai-adoption

Summary

While 72% of teams use coding agents in production, most lack formal governance or empirical data on agent reliability. The article argues for session-level tracking over policy frameworks to ensure trust in critical deployments.

There's a governance gap stat making the rounds this week: 72% of firms are in production with agentic AI, and 60% have no formal governance in place. Most of the discussion treats this as a policy problem: org charts, risk frameworks, sign-off procedures. That's not wrong, but I think it's the wrong layer to start at.

The layer underneath the policy question is this: can your team actually answer, for any given coding agent instance you're running, what that instance has demonstrated it can be trusted to do? Not "what is this model good at" in the general sense. What has this specific instance, running in your environment, on your codebase, shown it can handle reliably, and what has it consistently gotten wrong?

Most teams I've talked to can't answer that. Routing decisions are based on whoever used the agent last, what they remember working, and occasionally a benchmark rank that says nothing about performance in your specific context. That's not governance. It's informed guessing.

The evidence that would actually support a governance decision is session traces, per-instance behavioral data, and scores across dimensions like reasoning quality, constraint compliance, and handling ambiguity. Most teams aren't capturing it. You get the output. The session disappears. So you end up with a team that's in production with agents but couldn't reconstruct, for any critical deployment that went wrong, what the agent actually did step by step and whether it behaved consistently with prior sessions.

For those running agents, how are you handling this? Are you capturing session-level data, or operating on output and vibes?
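To make "session-level data" concrete, here is a rough Python sketch of what capturing it per instance could look like. Everything in it is an assumption for illustration: the `SessionRecord` and `SessionLog` names, the 0.0 to 1.0 score scale, the `task_type` labels, and the threshold and minimum-session defaults are all hypothetical, not taken from any particular tool. The three dimension names are the ones mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean

# Scoring dimensions named in the post; the identifiers are hypothetical.
DIMENSIONS = ("reasoning_quality", "constraint_compliance", "ambiguity_handling")


@dataclass
class SessionRecord:
    """One agent session: which instance ran, what it did, how it scored."""
    instance_id: str   # the specific agent instance, not just the model name
    task_type: str     # hypothetical label, e.g. "critical_path_change"
    steps: list[str] = field(default_factory=list)          # step-by-step trace
    scores: dict[str, float] = field(default_factory=dict)  # assumed 0.0-1.0 scale


class SessionLog:
    """In-memory store; a real setup would persist records somewhere durable."""

    def __init__(self) -> None:
        self._records: list[SessionRecord] = []

    def add(self, record: SessionRecord) -> None:
        self._records.append(record)

    def sessions(self, instance_id: str, task_type: str) -> list[SessionRecord]:
        """All recorded sessions for one instance on one task type."""
        return [
            r for r in self._records
            if r.instance_id == instance_id and r.task_type == task_type
        ]

    def profile(self, instance_id: str, task_type: str) -> dict[str, float]:
        """Mean score per dimension for one instance on one task type."""
        by_dim: dict[str, list[float]] = defaultdict(list)
        for record in self.sessions(instance_id, task_type):
            for dim, score in record.scores.items():
                by_dim[dim].append(score)
        return {dim: mean(values) for dim, values in by_dim.items()}

    def trusted_for(self, instance_id: str, task_type: str,
                    threshold: float = 0.8, min_sessions: int = 3) -> bool:
        """True only with enough sessions AND every dimension at or above threshold."""
        if len(self.sessions(instance_id, task_type)) < min_sessions:
            return False  # no track record is not the same as a good one
        scores = self.profile(instance_id, task_type)
        return all(scores.get(dim, 0.0) >= threshold for dim in DIMENSIONS)


# Example usage with made-up data.
log = SessionLog()
log.add(SessionRecord(
    instance_id="agent-a",
    task_type="critical_path_change",
    steps=["read failing test", "patched handler", "ran test suite"],
    scores={dim: 0.9 for dim in DIMENSIONS},
))
print(log.profile("agent-a", "critical_path_change"))
print(log.trusted_for("agent-a", "critical_path_change"))
```

The point of `trusted_for` is that an instance with no recorded sessions is refused by default, so the 11pm routing question gets answered from stored evidence rather than from whoever used the agent last. How the scores get produced (human review, automated checks, something else) is left open here.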

Similar Articles

People running coding agents across real repos: what breaks after the agent writes the code?

I analyzed how 50+ AI teams debug production agent failures and got surprised

After building agent teams for a dozen clients, here's what actually made them trust the system (and stop babysitting it)

74% of enterprises have rolled back AI agents after going live

AI agent management tools by governance layer not by feature list
