Signals: finding the most informative agent traces without LLM judges [R]

Reddit r/MachineLearning 05/10/26, 05:26 PM Papers

Summary

Katanemo Labs introduces 'Signals,' a lightweight method for identifying informative agent traces without using LLM judges or GPUs. In an annotation study on τ-bench, signal-based sampling reached an 82% informativeness rate versus 54% for random sampling.

Hello Peeps, Salman, Shuguang and Adil here from Katanemo Labs (a DigitalOcean company). We wanted to introduce our latest research on agentic systems, called Signals.

If you've been building agents, you've probably noticed that there are far too many agent traces/trajectories to review one by one, and using humans or extra LLM calls to inspect all of them gets expensive really fast. The paper proposes a lightweight way to compute structured “signals” from live agent interactions so you can surface the trajectories most worth looking at, without changing the agent’s online behavior. Computing Signals doesn't require a GPU.

Signals are grouped into a simple taxonomy across interaction, execution, and environment patterns, including things like misalignment, stagnation, disengagement, failure, looping, and exhaustion. In an annotation study on τ-bench, signal-based sampling reached an 82% informativeness rate versus 54% for random sampling, which translated to a 1.52x efficiency gain per informative trajectory (0.82 / 0.54 ≈ 1.52).

Paper: arXiv 2604.00356. [https://arxiv.org/abs/2604.00356](https://arxiv.org/abs/2604.00356)

Project where Signals are already implemented: [https://github.com/katanemo/plano](https://github.com/katanemo/plano)

Happy to answer questions on the taxonomy, implementation details, or where this breaks down.
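To make the idea of cheap, judge-free triage concrete, here is a minimal sketch of what rule-based signals and signal-based sampling could look like. It is not the paper's or plano's implementation: the `Step` and `Trajectory` types, the three heuristics (looping, failure, exhaustion), the unweighted sum used as a score, and the `max_steps` budget are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Step:
    """One agent step. Field names are hypothetical, not taken from the paper."""
    tool: str
    args: str
    ok: bool = True


@dataclass
class Trajectory:
    trace_id: str
    steps: list[Step] = field(default_factory=list)


def compute_signals(traj: Trajectory, max_steps: int = 20) -> dict[str, float]:
    """Cheap, rule-based indicators; no model calls and no GPU involved."""
    calls = Counter((s.tool, s.args) for s in traj.steps)
    # Looping (assumed heuristic): extra repeats of an identical (tool, args) call.
    looping = sum(n - 1 for n in calls.values())
    # Failure (assumed heuristic): steps whose tool call reported an error.
    failure = sum(1 for s in traj.steps if not s.ok)
    # Exhaustion (assumed heuristic): the trajectory used up its step budget.
    exhaustion = 1.0 if len(traj.steps) >= max_steps else 0.0
    return {
        "looping": float(looping),
        "failure": float(failure),
        "exhaustion": exhaustion,
    }


def triage(trajs: list[Trajectory], k: int, max_steps: int = 20) -> list[str]:
    """Return the ids of the k trajectories with the highest total signal score."""
    scored = [(sum(compute_signals(t, max_steps).values()), t.trace_id) for t in trajs]
    # Sort by score descending; break ties by id so the result is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [trace_id for _, trace_id in scored[:k]]


# Example: one uneventful trace and one that repeats a call and then fails.
clean = Trajectory("a", [Step("search", "q1"), Step("book", "x")])
loopy = Trajectory("b", [Step("search", "q1")] * 3 + [Step("book", "x", ok=False)])
selected = triage([clean, loopy], k=1)
```

Here `selected` contains only the id of the looping, failing trace, which is the one a reviewer would most likely want to open first. A real system would presumably cover the interaction and environment signals as well and calibrate how they are combined; the unweighted sum above is just the simplest placeholder.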


Similar Articles

Signals: Trajectory Sampling and Triage for Agentic Interactions

Papers with Code Trending

This paper proposes a lightweight, signal-based framework for efficiently triaging agentic interaction trajectories by computing low-cost indicators that identify informative samples without impacting online agent behavior, achieving an 82% informativeness rate on benchmarks.

Insights Generator: Systematic Corpus-Level Trace Diagnostics for LLM Agents

arXiv cs.AI

This paper introduces the Insights Generator, a multi-agent system for systematic corpus-level trace diagnostics of LLM agents, which generates evidence-backed insights by proposing and testing hypotheses across execution traces. Experiments show that using Insights Generator reports improves scaffold performance by 30.4 percentage points.

TRACE: Trajectory Reasoning through Adaptive Cross-Step Evidence Aggregation for LLM Agents

arXiv cs.CL

TRACE is a monitoring framework for long-horizon LLM agent trajectories that uses a Triage-Inspect-Judge loop to connect evidence across temporally distant actions, achieving high recall and F1 on evasive sabotage detection tasks.

TRACE: Trajectory Risk-Aware Compression for Long-Horizon Agent Safety

arXiv cs.AI

This paper proposes TRACE, a trajectory-level safety detection method for long-horizon LLM agents that compresses full trajectory evidence into a latent state to better aggregate dispersed risk signals, achieving state-of-the-art accuracy on multiple benchmarks.

@Vtrivedy10: there's a very exciting future agent recipe for building intelligence too cheap to meter, applied towards extracting si…

X AI KOLs Following

The post outlines a future agent recipe for building scalable intelligence by fine-tuning efficient, specialized open models to surpass frontier performance on LLM-as-a-judge tasks, and applying this to extract signals from trace data for continual learning. LangChain Labs and FireworksAI release new work demonstrating this approach.