Tag
Latitude is an open-source AI Agent Monitoring tool that provides issue detection, traces, and evals for LLM-based agents, similar to Sentry for AI.
A tweet highlighting Latitude, an open-source agent observability tool that helps visualize AI agent actions and token usage, with the ability to catch and fix recurring failures directly from the editor.
LangChain and Fireworks fine-tuned a Qwen model to detect 'Perceived Error' from agent traces, achieving a 100x cost reduction while maintaining frontier performance. The judge model is designed to enrich traces with error signals for monitoring agentic systems.
LangSmith Observability provides real-time monitoring for AI agents to help identify performance issues quickly.
agent-pd is an open-source logging and monitoring tool for Claude Code agents that records all tool and permission events and replays them through deterministic detectors to catch rule violations, without blocking any actions.
Adrian is an open-source AI agent runtime security monitoring engine that detects anomalies by jointly analyzing the agent’s behavior logs and reasoning chain, improving accuracy by 35% over behavior-only checks. It supports LangChain integration with a two-line SDK.
A logging hook and CLI tool that records all tool calls and permission events from Claude Code agents into a session log, then replays the log to audit for misbehavior like unauthorized file reads or permission escalation. It is a catch-and-report flight recorder, not a blocker.
Braintrust's Topics feature uses LLM summarization to make production agent traces tractable for clustering and classification at scale, inspired by Anthropic's Clio approach.
Proposes Agent-ToM, a learning-to-monitor framework using Theory-of-Mind reasoning to detect covert malicious behavior in autonomous LLM agents by inferring beliefs and intents, outperforming baseline monitors.
An open-source tool designed to detect silent coordination failures in agent systems, such as infinite loops and traffic spikes, with planned FinOps features to track costs and prevent budget overruns.
Dunetrace, an open-source real-time monitoring tool for production AI agents, adds cross-agent pattern analysis, Langfuse deep analysis integration, and custom agent support.
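
Several entries above describe a record-and-replay audit: tool and permission events are written to a session log, and the log is later replayed through deterministic detectors that report violations without blocking anything. The sketch below illustrates that general pattern only. It is not the API of agent-pd or any other tool listed here; the `Event` fields, the `unauthorized_read` detector, the `replay` function, and the JSON-lines log format are all hypothetical names and assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    """One logged event (hypothetical schema, not a real tool's format)."""
    kind: str    # "tool_call" or "permission"
    name: str    # tool or permission name
    target: str  # e.g. a file path or a grant status


# A detector is a pure function: same event in, same finding out.
Detector = Callable[[Event], Optional[str]]


def unauthorized_read(allowed_prefix: str) -> Detector:
    """Build a detector that flags file reads outside allowed_prefix."""
    def check(event: Event) -> Optional[str]:
        is_read = event.kind == "tool_call" and event.name == "read_file"
        if is_read and not event.target.startswith(allowed_prefix):
            return f"read outside {allowed_prefix}: {event.target}"
        return None
    return check


def replay(log_lines: Iterable[str], detectors: List[Detector]) -> List[str]:
    """Replay a JSON-lines session log and collect findings.

    Nothing is blocked here: the log describes actions that already
    happened, so the function only reports.
    """
    findings: List[str] = []
    for line in log_lines:
        event = Event(**json.loads(line))
        for detect in detectors:
            finding = detect(event)
            if finding is not None:
                findings.append(finding)
    return findings


# Assumed example log: two file reads and one permission event.
session_log = [
    json.dumps({"kind": "tool_call", "name": "read_file", "target": "/workspace/app.py"}),
    json.dumps({"kind": "tool_call", "name": "read_file", "target": "/etc/passwd"}),
    json.dumps({"kind": "permission", "name": "shell", "target": "granted"}),
]
findings = replay(session_log, [unauthorized_read("/workspace/")])
print(findings)
```

Because the detectors are deterministic and the log is immutable input, replaying the same session always yields the same findings, which is what makes this style of audit reproducible after the fact.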