We keep adding “skills” to our agents and have no idea which ones actually work. Solved problem?
Summary
A PM at an internal developer platform describes the challenge of tracking which AI agent skills are actually invoked and whether they are effective, and asks the community whether existing tools already solve this observability problem.
Similar Articles
How does your company measure the impact of agents and skills in real production, not just benchmarks?
A discussion on how companies should measure the real-world impact of AI agents and skills in production environments, rather than relying solely on benchmark results.
everyone's focused on whether their agent works. almost nobody asks if it's actually getting better over time
The article points out a common oversight in AI agent development: while most teams monitor task completion, few systems capture and feed failure patterns back into future runs to enable learning and improvement over time.
Which platform is your company using for ai agent observability and reliability needs?
A developer building multi-agent financial workflows seeks community advice on observability and reliability tooling for AI agents in production, sharing frustration with a fragmented tooling landscape and cascading failures.
Most of our “agent” problems turned out to be workflow/state problems
A developer recounts how many challenges in building AI agents stem from workflow and state management issues rather than from limits in model intelligence, emphasizing the need for robust state handling and observability.
What's the most useful AI agent you've seen in production?
A discussion about the most useful AI agents actually deployed in production, highlighting simple, single-problem solutions like lead qualification and support triage.