@AdamRLucek: Do agents listen to you… or themselves? While evaling subagent behavior in deep agent systems, we noticed an interestin…


Summary

A researcher shares an observation from evaluating subagent behavior in deep agent systems: subagents can end up following the orchestrator's instructions more closely than the hand-written system prompt.


Cached at: 05/21/26, 07:37 PM

Do agents listen to you… or themselves? While evaling subagent behavior in deep agent systems, we noticed an interesting quirk in our agents’ alignment with hand-written system prompts vs. the instructions given by the orchestrator 1/4

On a needle-in-a-haystack-style classification eval, a main agent relies on multiple subagents to parse many large (million+ token) datapoints and cluster them into related groups. There, we saw performance and behavior vary with the length and specificity of the additional instructions the orchestrator sent to the subagent 2/4

While our subagent system prompt was generally directional and open-ended, some orchestrator models wrote detailed rubrics and guidelines into their briefs, which made the subagent way too strict and limited its creative execution, hurting end performance. These larger briefs from the orchestrator often overrode the looser behavior we wanted our prompting to encourage 3/4
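One rough way to spot briefs like these is to compare each orchestrator brief against the subagent system prompt. The sketch below is an illustration only, not the eval used in this thread: the marker list, the thresholds, and the names `brief_stats` and `looks_overspecified` are all assumptions.

```python
import re
from dataclasses import dataclass

# Hypothetical words that tend to signal hard constraints in a brief.
STRICT_MARKERS = ("must", "never", "always", "only", "exactly", "do not")


@dataclass
class BriefStats:
    brief_words: int      # word count of the orchestrator's brief
    system_words: int     # word count of the hand-written system prompt
    strict_markers: int   # occurrences of hard-constraint words in the brief

    @property
    def length_ratio(self) -> float:
        # How much longer the brief is than the system prompt.
        return self.brief_words / max(self.system_words, 1)


def brief_stats(system_prompt: str, brief: str) -> BriefStats:
    lowered = brief.lower()
    markers = sum(
        len(re.findall(r"\b" + re.escape(marker) + r"\b", lowered))
        for marker in STRICT_MARKERS
    )
    return BriefStats(len(brief.split()), len(system_prompt.split()), markers)


def looks_overspecified(stats: BriefStats, max_ratio: float = 2.0,
                        max_markers: int = 3) -> bool:
    # Assumed thresholds; tune them against your own eval results.
    return stats.length_ratio > max_ratio or stats.strict_markers > max_markers
```

A flag from this heuristic only marks a brief worth reading by hand; it says nothing on its own about whether the brief hurt performance.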

The takeaway? It’s important to consider and measure not just how you are prompting a subagent, but how your primary agent is prompting it too. The relationship an agent has with its subagent delegations can make or break the overall system’s success 4/4
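To measure that relationship, one option is to log every delegation and compare eval scores across brief lengths. This is a minimal sketch under assumptions: each run is a dict with hypothetical `brief` and `score` keys, and the bucket edges (50 and 200 words) are arbitrary placeholders.

```python
from collections import defaultdict
from statistics import mean


def bucket(brief_words: int, edges: tuple = (50, 200)) -> str:
    # Assumed cut-offs for short / medium / long briefs, in words.
    if brief_words < edges[0]:
        return "short"
    if brief_words < edges[1]:
        return "medium"
    return "long"


def score_by_brief_length(runs: list) -> dict:
    # Group eval scores by how long the orchestrator's brief was,
    # then average each group.
    grouped = defaultdict(list)
    for run in runs:
        grouped[bucket(len(run["brief"].split()))].append(run["score"])
    return {name: mean(scores) for name, scores in grouped.items()}
```

If the averages drop as briefs get longer, that suggests the orchestrator's delegation style, not just the subagent prompt, deserves a closer look.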

On the money! Directional over exact

Maybe we’ve hit AGI already…

Any tips?
