Unpopular opinion: most production AI agents are flying blind and their developers don't know it

Reddit r/AI_Agents 07/04/26, 09:45 PM News

ai-agents observability production langchain monitoring cost-tracking

Summary

A developer argues that most production AI agents lack essential observability like session traces and cost tracking, comparing it to deploying a web app without monitoring. The article questions whether agent observability is an unsolved problem.

Talked to several dev agencies building LangChain/LangGraph agents for clients lately, plus seen a lot more in threads here and on r/LangChain. A pattern keeps showing up: zero production observability. No session traces. No per-session cost tracking. No alerting when the agent starts behaving differently. The usual answer: "we check the OpenAI dashboard" or "our client would tell us if something was wrong." This is insane to me. We wouldn't deploy a web app without Sentry and uptime monitoring. But somehow AI agents — which are way more unpredictable — get deployed with nothing. Is this just early days and everyone knows it? Or is observability for agents genuinely an unsolved problem? Curious what production setups actually look like at companies doing this seriously.

Original Article

Similar Articles

Something I keep seeing with AI projects that nobody talks about openly

Reddit r/AI_Agents

This article highlights that many AI agent projects fail in production not because of model quality, but because teams launch without clearly defining what constitutes failure, missing critical edge cases that lead to confident incorrect outputs.

Quick question for anyone running AI agents in production

Reddit r/AI_Agents

A question highlighting the lack of observability in AI agent memory layers, asking how teams debug incorrect retrievals without full traceability.

I analyzed how 50+ AI teams debug production agent failures and got surprised

Reddit r/AI_Agents

Based on interviews with 50+ AI teams, the author highlights that production agent failures often stem from minor prompt or configuration issues rather than deep model problems. The article advocates for adopting software engineering practices like versioning, A/B testing, and experiment tracking to improve reliability.

Most agent observability feels like crash footage

Reddit r/AI_Agents

The author argues that current agent observability provides a trace of actions but lacks runtime justification for why actions were permitted, which is critical for production deployments involving money, data, or communications.

The Real Truth About AI Agents

Reddit r/AI_Agents

An experienced practitioner shares hard-won lessons from deploying 25+ AI agents to production, arguing that memory, orchestration, and auditability matter far more than model choice. The article details common failure modes like context loss and silent cost loops, and recommends a stack including Claude Sonnet 4, Pydantic AI, and dedicated memory layers like Octopodas.

Similar Articles

Something I keep seeing with AI projects that nobody talks about openly

Quick question for anyone running AI agents in production

I analyzed how 50+ AI teams debug production agent failures and got surprised

Most agent observability feels like crash footage

The Real Truth About AI Agents

Submit Feedback