@JiaZhihao: Introducing Motus Tracing: open-source observability for AI agents. Without traces, an agent is a black box that burns …
Summary
Motus Tracing is a fully open-source observability layer for AI agents that captures every model call, tool call, sandbox interaction, and error, providing a unified span model for local development and cloud deployment with zero setup cost.
View Cached Full Text
Cached at: 05/15/26, 09:08 PM
Introducing Motus Tracing: open-source observability for AI agents. Without traces, an agent is a black box that burns tokens. Yet most agent observability and tracing stacks today live behind accounts and subscription tiers. Motus Tracing is fully open source. Capture every model call, tool call, sandbox interaction, sub-agent action, retry, and error, for any agent framework. One unified interface from development to production. Same spans for debugging, evals, and Learning Agents. Blog: https://lithosai.com/blog/motus-agent-tracing… Code: http://github.com/lithos-ai/motus @lithos_ai
Open-Source Observability for AI Agents | LithosAI
Source: https://www.lithosai.com/blog/motus-agent-tracing ← Back to blogMay 15, 2026·5min read
Observability is non-negotiable for AI agents of any shape or size. Without observability, the agent is a black box that burns tokens. It picks tools, branches, retries, and rewrites its own plan, and it cannot be debugged, evaluated, or improved.
Motus agent tracing is our observability layer. It records every step of a run in detail: model invocations, tool calls, sandbox interactions, sub-agent actions, retries, errors, and task transitions. Those steps are exposed through a span model that shares the same core across local development and cloud deployment. It works with any framework, whether Motus-native agents or agents built on the OpenAI Agents SDK, the Anthropic SDK, the Google ADK, or plain Python.The entire tracing stack isfully open sourceand free to use.
The design is shaped by three constraints. First, instrumentation must impose no setup cost: anyone building agents should be able to inspect a run the moment a problem appears, without any agent modifications. Second, the trace produced locally and the trace emitted from a deployed agent must share the same schema, so the analysis tooling, the debugging workflow, and the downstream evaluation pipeline all operate on one representation. Third, telemetry is not only for debugging but for improvement. The same spans that surface a failure should feed your favorite coding agent when iterating on the agent, or power more advanced techniques likeLearning Agents.
Local Tracing
Local tracing ships as part of the open-sourcemotusPython package. It is enabled with a single environment variable:
MOTUS_TRACING=1 uv run python agent.py
The runtime writes a self-containedtrace\_viewer\.htmlfile to disk. Opening it renders the complete span tree for the run, including model calls, tool calls, durations, token counts, and payload contents. No server, network connection, or account is required.
The local Motus trace viewer renders the full span tree for a single run, including model calls, tool calls, timing, token usage, and payload details.For interactive sessions in which spans should appear as the agent executes, the runtime exposes a streaming variant:
MOTUS_TRACING_ONLINE=1 uv run python agent.py
This starts a local viewer backed by Server-Sent Events. Spans are pushed to the browser as they close, allowing the developer to inspect partial state, identify hangs, and observe errors without waiting for the run to terminate.
Cloud Telemetry
The same agent source runs locally undermotus serveand in Motus Cloud undermotus deploy:
motus deploy --name support-agent agent:agent
In the deployed environment, the runtime emits traces continuously and tags each one with project, build, and session identifiers. There is no second SDK to integrate and no conditional code path for production. The same span emission that producestrace\_viewer\.htmllocally produces the cloud trace stream, ensuring that an engineer comparing a local reproduction to a production failure is comparing structurally identical records.
Inspecting a deployed agent run in the Motus Cloud trace viewer.The local and cloud boundaries are kept strictly separate. Traces generated bymotus serveor by direct invocation remain on the developer’s machine even when the user is authenticated; cloud upload occurs only inside the deployed cloud runtime. This separation is deliberate: development traces frequently contain unredacted prompts, synthetic test data, and incomplete experiments that should not be transmitted off-host.
The Span Model
The trace viewer is organized around a hierarchical span tree. Each span belongs to one of three classes:
- Model spansrecord the input messages, the generated output, the tools made available, token usage, and inference cost.
- Tool spansrecord arguments, return values, raised exceptions, and execution time.
- Task spansgroup model and tool spans into logical units of work, preserving the agent’s reasoning structure.
The viewer renders these spans as a timeline alongside the tree, so latency outliers and error sites are visible without inspecting individual payloads. The same view answers the questions that dominate agent debugging: what the model received, which tool it selected, what arguments it passed, what was returned, where time was spent, and which step failed.
The Motus Cloud trace viewer presents the timeline, span tree, and per-span detail in a single view.## Framework Independence
Instrumentation is implemented at the runtime layer rather than inside any single agent framework. Agents written with the Motus-nativeReActAgentand@tooldecorators are supported, as are agents built on the OpenAI Agents SDK, the Anthropic SDK, the Google ADK, or plain Python. Once an agent is served by Motus, the operational surface—sessions, traces, spans, deploys, and the cloud UI—is identical regardless of the upstream framework.
Traces as the Substrate for Learning Agents
ForLearning Agents, traces provide the primary feedback signal. The optimizer relies on production traces to identify failure modes, attribute cost, sample representative inputs, and quantify behavioral differences between candidate versions.
Because the trace schema is invariant across environments, the same spans an engineer inspects in the viewer become inputs to regression sets, error-driven datasets, traffic-sampled datasets, and cost-focused datasets. Debugging an individual run and improving the next deployed version draw from a single source of evidence.
Open Source, Available to Everyone
Motus tracing is open source under a permissive license. The Python runtime instrumentation, the span schema, and the local trace viewer all live in thelithos-ai/motusrepository. Installing the package is sufficient to capture, render, and stream traces for any agent, on any machine, with no account and no network dependency. Self-hosted deployments running undermotus serveretain the same tracing surface as the managed cloud; the cloud adds hosted retention, multi-session indexing, and the integration with Learning Agents, but the underlying observability is available to every developer running the open-source runtime.
Similar Articles
Tracea
Tracea is a new product offering Datadog-like observability for AI agents, providing features such as tracing, root cause analysis, and team memory.
@martin_casado: This tackles a very hard, very important problem in AI systems. Basically how do you expose your traces at scale to age…
A tweet by Martin Casado highlighting a solution to the difficult problem of exposing traces at scale to AI agents, balancing cost and AI leverage.
@benhylak: we built the first sane way to debug your agent locally. you can see your traces. codex/claude code can too. this lets …
A new open source tool enables local debugging of AI agents by viewing traces, allowing automated eval writing and testing with tools like codex and Claude code.
@zxlzr: Introducing MemTrace: Making LLM Memory Systems Finally Debuggable Memory is becoming a core component of AI agents. Bu…
MemTrace is a new tool that makes LLM memory systems debuggable by tracing memory operations across multiple turns, addressing the black-box nature of current memory-augmented agents.
Agent Trace RFC
Agent Trace is an open specification for tracking AI-generated code in version-controlled codebases, defining a vendor-neutral format to record AI contributions alongside human authorship.