A Two-Dimensional Framework for AI Agent Design Patterns: Cognitive Function and Execution Topology

Summary

This paper proposes a two-dimensional classification framework for AI agent design patterns that combines cognitive function and execution topology axes, identifying 27 named patterns and deriving empirical laws from cross-domain analysis.

arXiv:2605.13850v1 Announce Type: new

# A Two-Dimensional Framework for AI Agent Design Patterns: Cognitive Function × Execution Topology
Source: [https://arxiv.org/html/2605.13850](https://arxiv.org/html/2605.13850)
Jia Huang¹ and Joey Tianyi Zhou¹,²; ¹Agency for Science, Technology and Research \(A\*STAR\), Singapore; ²Centre for Frontier AI Research \(CFAR\), A\*STAR\. Email: huang\_jia@a\-star\.edu\.sg, joey\_zhou@a\-star\.edu\.sg

\(March 2026\)

###### Abstract

Existing frameworks for LLM\-based agent architectures describe systems from a single perspective: industry guides \(Anthropic, Google, LangChain\) focus on *execution topology* \(how data flows\), while cognitive science surveys focus on *cognitive function* \(what the agent does\)\. Neither axis alone disambiguates architecturally distinct systems: the same Orchestrator\-Workers topology can implement Plan\-and\-Execute, Hierarchical Delegation, or Adversarial Verification, three patterns with fundamentally different failure modes and design trade\-offs\.

We propose a two\-dimensional classification that combines \(1\) a Cognitive Function axis with seven categories \(Context Engineering, Memory, Reasoning, Action, Reflection, Collaboration, Governance\) and \(2\) an Execution Topology axis with six structural archetypes \(Chain, Route, Parallel, Orchestrate, Loop, Hierarchy\)\. The resulting 7×6 matrix identifies 27 named patterns, 13 with original names\. We demonstrate orthogonality through systematic cross\-axis analysis, define eight representative patterns in detail, and validate descriptive coverage across four real\-world domains \(financial lending, legal due diligence, network operations, healthcare triage\)\. Cross\-domain analysis yields five empirical laws of pattern selection governing the relationship between environmental constraints \(time pressure, action authority, failure cost asymmetry, volume\) and architectural choices\. The framework provides a principled, framework\-neutral, and model\-agnostic vocabulary for AI agent architecture design\.

Keywords: AI agents, design patterns, taxonomy, cognitive function, execution topology, software architecture, multi\-agent systems

## 1 Introduction

The rapid deployment of LLM\-based agent systems has produced a fragmented landscape of architectural guidance\. Every major AI organization has published its own framework for understanding agent architectures:

- Anthropic’s “Building Effective Agents” \[[1](https://arxiv.org/html/2605.13850#bib.bib1)\] identifies six execution topologies \(prompt chaining, routing, parallelization, orchestrator\-workers, evaluator\-optimizer, autonomous agents\)\.
- Google’s Agent Development Kit \[[2](https://arxiv.org/html/2605.13850#bib.bib2)\] describes eight workflow patterns organized around sequential, parallel, and loop structures\.
- LangChain’s multi\-agent guide \[[3](https://arxiv.org/html/2605.13850#bib.bib3)\] presents four coordination patterns \(supervisor, hierarchical, network, handoff\)\.
- Andrew Ng’s agentic design patterns \[[4](https://arxiv.org/html/2605.13850#bib.bib4)\] identifies four cognitive capabilities \(reflection, tool use, planning, multi\-agent collaboration\)\.

The critical observation: all existing frameworks describe agent architectures along *only one axis*\. Industry sources focus on execution topology: *how* data flows\. Cognitive surveys \[[5](https://arxiv.org/html/2605.13850#bib.bib5),[6](https://arxiv.org/html/2605.13850#bib.bib6),[7](https://arxiv.org/html/2605.13850#bib.bib7)\] focus on functional capability: *what* the agent does\. Neither alone disambiguates architecturally distinct systems\.

Consider the Orchestrator\-Workers topology\. The same structural wiring diagram serves at least three fundamentally different patterns:

1. Plan\-and\-Execute \(Action\): a planner decomposes a task into subtasks and dispatches them to executor agents\.
2. Hierarchical Delegation \(Collaboration\): a manager obtains specialized expertise from domain\-specific sub\-agents\.
3. Observability Harness \(Governance\): a central monitor orchestrates logging, tracing, and alerting across agent modules\.

These are architecturally distinct systems with different failure modes, different scaling properties, and different testing strategies—yet they share a topology\. Without the cognitive function axis, they are indistinguishable\. Similarly, a single cognitive function can be realized by multiple topologies: Reasoning \(C3\) can be implemented as Chain\-of\-Thought \(Chain\), Complexity\-Based Routing \(Route\), Parallel Exploration \(Parallel\), or Iterative Hypothesis Testing \(Loop\)\. The choice of topology determines latency, cost, and failure characteristics\.

This paper makes four contributions:

1. A two\-dimensional classification framework that combines cognitive function and execution topology into a single coordinate system \(Section [2](https://arxiv.org/html/2605.13850#S2)\)\.
2. Detailed definitions of eight representative patterns, one per cognitive function plus a second governance pattern, sufficient for independent understanding \(Section [3](https://arxiv.org/html/2605.13850#S3)\)\.
3. A systematic orthogonality demonstration showing that neither axis reduces to the other \(Section [4](https://arxiv.org/html/2605.13850#S4)\)\.
4. A coverage evaluation across four real\-world domains that validates the framework’s descriptive power and yields five empirical laws of pattern selection \(Section [6](https://arxiv.org/html/2605.13850#S6)\)\.

## 2 The Two\-Dimensional Framework

### 2\.1 Design Principles

Our framework is guided by three principles:

Orthogonality\. The two axes must be independently variable\. A change in cognitive function should not necessitate a change in execution topology, and vice versa\.

Completeness\. The axes should cover all capabilities required for production agent systems and all structural archetypes sufficient to compose any agent workflow\.

Durability\. Categories describe *structural needs* and *structural forms* that persist across framework and model changes\. “Context Engineering” remains relevant whether the context window is 4K or 2M tokens; “Loop” remains relevant whether the loop body is a GPT\-4 call or a Claude call\.

### 2\.2 Axis 1: Cognitive Function \(What\)

We identify seven cognitive function categories\. These are grounded in the cognitive science literature on language agents \[[7](https://arxiv.org/html/2605.13850#bib.bib7)\] and extended with two categories \(Governance, Context Engineering\) that emerged from production deployment analysis:

Table 1: Seven cognitive function categories\.

The seven categories form a cognitive processing pipeline: an agent perceives input \(C1\), retrieves relevant knowledge \(C2\), reasons about what to do \(C3\), executes actions \(C4\), evaluates its outputs \(C5\), coordinates with other agents when needed \(C6\), and operates within governance constraints throughout \(C7\)\. This pipeline is not strictly sequential, since agents cycle through Perception\-Reasoning\-Action loops repeatedly, but the categories are functionally distinct\.

### 2\.3 Axis 2: Execution Topology \(How\)

We identify six execution topology archetypes\. These subsume the topologies described in existing industry frameworks \[[1](https://arxiv.org/html/2605.13850#bib.bib1),[2](https://arxiv.org/html/2605.13850#bib.bib2)\]:

Table 2: Six execution topology archetypes\.

### 2\.4 The 7×6 Pattern Matrix

The Cartesian product yields a 7×6 = 42 cell matrix\. We identify 27 named patterns occupying these cells; the remaining 15 cells are either structurally redundant or not yet observed in practice\. Table [3](https://arxiv.org/html/2605.13850#S2.T3) shows the complete matrix\. Cells marked with ★ carry original names coined in this work\.

Table 3: The 7×6 pattern matrix\. 27 named patterns; ★ = original name coined in this work\.
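
The coordinate system can be illustrated with a small lookup structure. The sketch below is only an illustration: the axis labels follow the category and archetype lists given in this section, and it includes only the thirteen patterns whose coordinates are stated explicitly in Sections 3 and 4, not the full 27. The names `Function`, `Topology`, `PATTERNS`, `cells`, `by_topology`, and `by_function` are hypothetical.

```python
from enum import Enum


class Function(Enum):
    C1 = "Context Engineering"
    C2 = "Memory"
    C3 = "Reasoning"
    C4 = "Action"
    C5 = "Reflection"
    C6 = "Collaboration"
    C7 = "Governance"


class Topology(Enum):
    T1 = "Chain"
    T2 = "Route"
    T3 = "Parallel"
    T4 = "Orchestrate"
    T5 = "Loop"
    T6 = "Hierarchy"


# Only patterns whose coordinates are stated in Sections 3 and 4.
PATTERNS = {
    "Context Triage": (Function.C1, Topology.T2),
    "RAG Pipeline": (Function.C2, Topology.T1),
    "Failure Journal": (Function.C2, Topology.T5),
    "Chain-of-Thought": (Function.C3, Topology.T1),
    "Complexity-Based Routing": (Function.C3, Topology.T2),
    "Parallel Exploration": (Function.C3, Topology.T3),
    "Iterative Hypothesis Testing": (Function.C3, Topology.T5),
    "Plan-and-Execute": (Function.C4, Topology.T4),
    "ReAct Loop": (Function.C4, Topology.T5),
    "Generator-Critic": (Function.C5, Topology.T5),
    "Fan-Out/Gather": (Function.C6, Topology.T3),
    "Approval Gate": (Function.C7, Topology.T2),
    "Blast Radius Control": (Function.C7, Topology.T6),
}


def cells():
    """Every (function, topology) coordinate in the matrix."""
    return [(f, t) for f in Function for t in Topology]


def by_topology(topology):
    """Names of listed patterns that share one topology column."""
    return sorted(n for n, (_, t) in PATTERNS.items() if t is topology)


def by_function(function):
    """Names of listed patterns that share one cognitive function row."""
    return sorted(n for n, (f, _) in PATTERNS.items() if f is function)
```

Querying a column with `by_topology(Topology.T5)` or a row with `by_function(Function.C3)` returns several patterns each, which is the property the orthogonality argument relies on.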

## 3 Representative Pattern Definitions

We define eight representative patterns, at least one per cognitive function row, in sufficient detail for independent understanding\. Each definition follows the format: *coordinates, problem, architectural solution, and engineering trade\-offs*\. The remaining 19 patterns follow the same template\.

### 3\.1 Context Triage \(C1×T2: Context Engineering × Route\)

Problem\. An agent at task start has access to many information sources: the user’s message, conversation history, project files, documentation, tool outputs, retrieved knowledge, and environmental metadata\. The context window cannot hold everything\. The naive approach \(first\-in\-first\-out, truncate when full\) fails because relevance is not correlated with recency\.

Solution\. Context Triage applies emergency\-room triage logic to information selection\. Every information source is classified by priority \(P0–P3\), and a routing function dispatches each source to the appropriate treatment: P0 \(always load\), P1 \(load if relevant\), P2 \(load on demand\), P3 \(never load\)\. The routing function evaluates source relevance against the current task description, token budget constraints, and cache\-friendliness\. Claude Code’s five\-level CLAUDE\.md hierarchy \(Enterprise → User → Project → Rules → Local\) is a production implementation of this pattern\.

Trade\-offs\. Higher triage accuracy reduces context noise but increases routing latency\. Over\-aggressive filtering risks starving the agent of critical information; under\-filtering dilutes attention quality \[[12](https://arxiv.org/html/2605.13850#bib.bib12)\]\.
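
The P0 to P3 routing rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the `relevance` score, the `threshold` default, and the names `Source` and `triage` are hypothetical, and cache-friendliness is left out for brevity.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    P0 = 0  # always load
    P1 = 1  # load if relevant
    P2 = 2  # load on demand
    P3 = 3  # never load


@dataclass
class Source:
    name: str
    priority: Priority
    tokens: int
    relevance: float  # assumed score in [0, 1] against the current task


def triage(sources, budget, threshold=0.5):
    """Route each source; return (loaded, on_demand) lists of names."""
    loaded, on_demand, used = [], [], 0
    # Visit P0 first, then the most relevant sources within each priority.
    for s in sorted(sources, key=lambda s: (s.priority, -s.relevance)):
        if s.priority is Priority.P0:
            loaded.append(s.name)  # loaded regardless of budget
            used += s.tokens
        elif s.priority is Priority.P1:
            if s.relevance >= threshold and used + s.tokens <= budget:
                loaded.append(s.name)
                used += s.tokens
        elif s.priority is Priority.P2:
            on_demand.append(s.name)  # kept as a handle, fetched later
        # P3 sources are dropped entirely.
    return loaded, on_demand
```

Raising `threshold` corresponds to the over-aggressive filtering described above, and lowering it corresponds to under-filtering.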

### 3\.2 RAG Pipeline \(C2×T1: Memory × Chain\)

Problem\. The agent needs knowledge beyond what fits in its context window or its training data\. The knowledge may be domain\-specific, frequently updated, or proprietary\.

Solution\. Retrieval\-Augmented Generation \[[13](https://arxiv.org/html/2605.13850#bib.bib13)\] implements an “open\-book exam”: query → retrieve → rerank → generate\. The chain topology ensures each step’s output feeds the next\. The retrieval step converts the agent’s information need into a vector query against an external knowledge store; the reranking step filters and orders results by relevance; the generation step synthesizes an answer conditioned on retrieved evidence\. MemGPT \[[14](https://arxiv.org/html/2605.13850#bib.bib14)\] extends this with virtual context management, paging data between the context window and external storage using OS\-inspired memory management\.

Trade\-offs\. Each retrieved chunk consumes token budget\. Retrieving 10 chunks at 500 tokens each costs 5,000 tokens that cannot be spent on reasoning\. The architect must balance recall \(more chunks, more evidence\) against precision \(fewer chunks, more attention per chunk\)\.
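
The chain and its token budget can be illustrated with a toy pipeline. The sketch below makes several simplifying assumptions: term overlap stands in for vector retrieval, whitespace-separated words stand in for tokens, and `generate` is a placeholder for a model call. All function names are hypothetical.

```python
def tokenize(text):
    """Lowercased set of whitespace-separated terms."""
    return set(text.lower().split())


def retrieve(query, corpus, k=3):
    """Toy retrieval: first k chunks sharing at least one term with the query."""
    q = tokenize(query)
    return [c for c in corpus if q & tokenize(c)][:k]


def rerank(query, chunks):
    """Order chunks by how many query terms they contain."""
    q = tokenize(query)
    return sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)


def pack(chunks, budget_tokens):
    """Keep top-ranked chunks until the (word-count) budget is spent."""
    kept, used = [], 0
    for chunk in chunks:
        cost = len(chunk.split())
        if used + cost > budget_tokens:
            break
        kept.append(chunk)
        used += cost
    return kept


def generate(query, evidence):
    """Placeholder for an LLM call: returns the prompt it would receive."""
    lines = "\n".join(f"- {e}" for e in evidence)
    return f"Q: {query}\nEvidence:\n{lines}"


def rag(query, corpus, k=3, budget_tokens=20):
    """query -> retrieve -> rerank -> generate, each step feeding the next."""
    ranked = rerank(query, retrieve(query, corpus, k))
    return generate(query, pack(ranked, budget_tokens))
```

The `budget_tokens` parameter is where the recall versus precision trade-off shows up: a larger budget admits more evidence, a smaller one keeps only the highest-ranked chunks.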

### 3\.3 Complexity\-Based Routing \(C3×T2: Reasoning × Route\)

Problem\. Agent workloads contain queries of widely varying difficulty\. Applying deep Chain\-of\-Thought reasoning with 64K thinking tokens to “What are your store hours?” wastes compute; applying a quick heuristic to a multi\-step diagnostic task produces errors\.

Solution\. A lightweight classifier evaluates each incoming query and routes it to the appropriate reasoning depth: System 1 \(direct response, ∼500 tokens\) for simple queries, System 2 \(Chain\-of\-Thought, ∼8K tokens\) for moderate queries, and extended deliberation \(∼64K tokens\) for complex queries\. This mirrors Kahneman’s dual\-process theory \[[15](https://arxiv.org/html/2605.13850#bib.bib15)\]: the architecture decides *how deeply* to think before thinking\. RouteLLM \[[16](https://arxiv.org/html/2605.13850#bib.bib16)\] reported cost reductions of about 85% on some benchmarks, with minimal quality loss, by routing between strong and weak models\.

Trade\-offs\. Classifier accuracy determines system performance\. Misrouting a complex query to System 1 produces errors; misrouting a simple query to System 2 wastes tokens\. At 100,000 daily queries, the difference between $0\.0015 and $0\.19 per query is $18,850/day\.
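
The routing decision and the cost arithmetic above can be made concrete. In this sketch the token budgets come from the text, while the word-count heuristic in `classify` is a hypothetical stand-in for a trained classifier and its thresholds are arbitrary assumptions.

```python
# Token budgets per reasoning tier, as given in the text.
TIERS = {
    "system1": 500,      # direct response
    "system2": 8_000,    # Chain-of-Thought
    "extended": 64_000,  # extended deliberation
}


def classify(query):
    """Hypothetical heuristic: longer queries are assumed to be harder."""
    words = len(query.split())
    if words <= 8:
        return "system1"
    if words <= 40:
        return "system2"
    return "extended"


def route(query):
    """Return the chosen tier and its thinking-token budget."""
    tier = classify(query)
    return tier, TIERS[tier]


def daily_cost_gap(queries_per_day, cheap_cost, expensive_cost):
    """Extra daily spend if every query took the expensive path."""
    return queries_per_day * (expensive_cost - cheap_cost)
```

With the figures quoted above, `daily_cost_gap(100_000, 0.0015, 0.19)` evaluates to approximately 18,850, since the per-query gap is $0.1885.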

### 3\.4 Plan\-and\-Execute \(C4×T4: Action × Orchestrate\)

Problem\. A complex task requires multiple tool calls where the execution order depends on intermediate results\. A single\-step approach cannot decompose the task; a purely sequential chain is too rigid to handle dynamic dependencies\.

Solution\. Separate strategy from tactics: a *planner* agent decomposes the task into a directed acyclic graph \(DAG\) of subtasks, and an *executor* agent \(or pool of executors\) carries out each subtask\. The planner can use a cheaper model; the executors handle tool invocation with domain\-specific prompts\. This is the Saga pattern from distributed systems \[[17](https://arxiv.org/html/2605.13850#bib.bib17)\] adapted for agent workflows: each subtask is a compensatable action, and the planner manages the overall transaction\.

Trade\-offs\. Plan quality determines execution efficiency\. Over\-decomposition creates unnecessary coordination overhead; under\-decomposition creates subtasks too complex for executors\. The separation also introduces latency: planning before executing is slower than immediate execution for simple tasks\.
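
The planner and executor split can be sketched with the standard library's `graphlib`. This is an illustrative skeleton only: `plan` returns a hard-coded DAG where a real planner would call a model, `execute` is a placeholder for tool invocation, and the subtask names are invented. Saga-style compensation is omitted.

```python
from graphlib import TopologicalSorter


def plan(task):
    """Hypothetical planner: maps each subtask to its prerequisites."""
    return {
        "fetch_statements": set(),
        "fetch_credit_report": set(),
        "compute_ratios": {"fetch_statements"},
        "write_summary": {"compute_ratios", "fetch_credit_report"},
    }


def execute(subtask, results):
    """Hypothetical executor: a tool call would use earlier results here."""
    return f"{subtask} done"


def plan_and_execute(task):
    """Run subtasks in an order that respects the planner's DAG."""
    dag = plan(task)
    order, results = [], {}
    # static_order() yields every node after all of its prerequisites.
    for subtask in TopologicalSorter(dag).static_order():
        results[subtask] = execute(subtask, results)
        order.append(subtask)
    return order, results
```

Because the DAG is data, the planner can be swapped or re-run without touching the executor, which is the separation of strategy from tactics described above.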

### 3\.5 Generator\-Critic \(C5×T5: Reflection × Loop\)

Problem\. An agent’s first\-draft output is functional but imperfect\. The agent needs to improve this output through structured critique before delivery\.

Solution\. Separate generation from evaluation and iterate: generate → critique → revise → critique → … → accept\. The critical design choice is the *feedback source*\. Huang et al\. \[[18](https://arxiv.org/html/2605.13850#bib.bib18)\] \(ICLR 2024\) demonstrated that LLMs cannot reliably self\-correct without external feedback\. Three variants address this: \(1\) self\-critique using different prompts for generation and evaluation, \(2\) cross\-model critique where a separate model evaluates, and \(3\) tool\-grounded critique where a test suite, linter, or calculator provides deterministic feedback\. CRITIC \[[19](https://arxiv.org/html/2605.13850#bib.bib19)\] showed that tool\-interactive critiquing consistently outperforms pure self\-critique\. Self\-Refine \[[20](https://arxiv.org/html/2605.13850#bib.bib20)\] demonstrated ∼20% absolute improvement across seven tasks through 2–4 iterations\.

Trade\-offs\. Each iteration costs tokens\. Convergence requires a stopping criterion, either a quality threshold or an iteration budget, to prevent over\-editing or oscillation\.
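
A tool-grounded variant of the loop, with an iteration budget as the stopping criterion, might look like the sketch below. `toy_generate` is a hypothetical stand-in for a model call, and the critic runs a single deterministic check in place of a real test suite; none of these names come from the paper.

```python
def refine(generate, critique, max_iters=4):
    """Iterate generate -> critique until accepted or the budget runs out."""
    draft, feedback = None, []
    for iteration in range(1, max_iters + 1):
        draft = generate(draft, feedback)
        feedback = critique(draft)
        if not feedback:  # empty feedback means the critic accepts
            return draft, iteration, True
    return draft, max_iters, False


def toy_generate(draft, feedback):
    """Hypothetical generator: the first draft is buggy, revisions fix it."""
    if draft is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def tool_critique(source):
    """Tool-grounded critic: execute the draft and check one known case."""
    namespace = {}
    exec(source, namespace)
    got = namespace["add"](2, 3)
    return [] if got == 5 else [f"add(2, 3) returned {got}, expected 5"]
```

Here the toy generator converges on the second iteration; a generator that never fixes the bug would stop at `max_iters` with the accepted flag set to `False`, which is the budget-based guard against oscillation.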

### 3\.6 Fan\-Out/Gather \(C6×T3: Collaboration × Parallel\)

Problem\. A task is decomposable into independent subtasks that can be processed simultaneously\. A single agent processing them sequentially would take *n*× the time\.

Solution\. A coordinator fans out subtasks to *n* worker agents running in parallel, then gathers and aggregates their results\. Each worker operates in an isolated context window containing only its subtask\. The coordinator handles result synthesis, conflict resolution, and quality assessment\. Du et al\. \[[21](https://arxiv.org/html/2605.13850#bib.bib21)\] showed that multiagent debate among multiple LLM instances improves factuality and reasoning, with consensus\-based aggregation outperforming single\-agent baselines on arithmetic and strategic reasoning benchmarks\. However, naive aggregation without structured debate protocols can amplify errors when agents produce conflicting outputs\.

Trade\-offs\. *n* workers consume *n*× the tokens\. The aggregation step is the quality bottleneck: naive concatenation of worker outputs produces incoherent results\. Worker independence must be genuine; interdependent subtasks force sequential execution regardless of topology\.
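
The coordinator's fan-out and gather steps map naturally onto a thread pool. In this sketch `worker` is a hypothetical stand-in for an agent call that receives only its own subtask, and `gather` merely merges results by key where a real coordinator would also synthesize and resolve conflicts.

```python
from concurrent.futures import ThreadPoolExecutor


def worker(subtask):
    """Hypothetical worker: sees only its own subtask (isolated context)."""
    return {"subtask": subtask, "finding": subtask.upper()}


def gather(results):
    """Aggregate worker outputs into one structure keyed by subtask."""
    return {r["subtask"]: r["finding"] for r in results}


def fan_out_gather(subtasks, max_workers=4):
    """Dispatch independent subtasks in parallel, then aggregate."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, whatever the finish order.
        results = list(pool.map(worker, subtasks))
    return gather(results)
```

Threads suit this pattern when workers are I/O-bound, as remote model calls typically are. The approach only helps when the subtasks are genuinely independent, as noted above.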

### 3\.7 Approval Gate \(C7×T2: Governance × Route\)

Problem\. An agent that acts on the real world faces a dilemma: too many approval prompts cause “approval fatigue” \(users click approve reflexively\), while too few allow irreversible damage\.

Solution\. Route every agent action through a three\-stage evaluation: \(1\) Deny rules \(absolute priority: block dangerous actions unconditionally\), \(2\) Allow rules \(auto\-approve low\-risk actions to reduce noise\), \(3\) Human gate \(residual: anything not denied or allowed reaches the human\)\. Actions are classified along two dimensions: reversibility \(can the action be undone?\) and impact \(how much damage if wrong?\)\. Claude Code implements this as a five\-tier permission system ranging from `default` \(prompt for everything dangerous\) to `bypassPermissions` \(no gates\)\.

Trade\-offs\. Classifier accuracy is critical\. Overestimating risk blocks safe actions and frustrates users; underestimating risk lets dangerous actions through\. The key design variable is the granularity of the reversibility/impact classification: too coarse and safe actions are blocked; too fine and the classifier itself becomes a maintenance burden\.
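
The deny, allow, human ordering can be sketched as a single routing function. The rule contents here are hypothetical: the `DENY` set, the two-valued `impact` field, and the allow condition (reversible and low impact) are illustrative assumptions rather than rules taken from any real permission system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    reversible: bool
    impact: str  # assumed to be "low" or "high"


# Hypothetical deny list; checked first and never overridden.
DENY = {"drop_database", "force_push_main"}


def gate(action):
    """Route an action: deny rules, then allow rules, then the human gate."""
    if action.name in DENY:
        return "deny"
    if action.reversible and action.impact == "low":
        return "allow"
    return "ask_human"  # residual: neither denied nor auto-approved
```

Evaluating deny rules before allow rules means an action on the deny list is blocked even if it would otherwise qualify for auto-approval.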

### 3\.8 Blast Radius Control \(C7×T6: Governance × Hierarchy\)

Problem\. Even with approval gates, an agent may cause damage through unexpected tool interactions, cascading failures, or actions that are individually safe but collectively dangerous\.

Solution\. Nested containment hierarchies limit the maximum damage any single action can cause\. Each level constrains the child: process sandbox → filesystem isolation → network restrictions → API rate limits → budget caps\. The hierarchy topology is essential: each containment layer can enforce different policies, and the outermost layer represents the organizational risk boundary\. Codex CLI’s sandbox enables a “full\-auto” mode precisely because the sandbox guarantees bounded damage\.

Trade\-offs\. Tighter containment reduces blast radius but also reduces agent capability\. Finding the minimum viable containment \(the tightest sandbox that still permits the agent to accomplish its task\) is the governance architect’s central challenge\.
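
One way to picture nested containment is as a chain of layers in which a child can never exceed its parent. The sketch below models only numeric caps; the `Layer` class, the resource names, and the limit values are hypothetical, and real sandboxes enforce far more than numeric quotas.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Layer:
    name: str
    limits: dict  # resource name -> maximum allowed amount
    parent: Optional["Layer"] = None

    def effective_limit(self, resource):
        """Tightest cap on a resource across this layer and its ancestors."""
        own = self.limits.get(resource, float("inf"))
        if self.parent is None:
            return own
        return min(own, self.parent.effective_limit(resource))

    def permits(self, resource, amount):
        """True if the request fits inside every enclosing layer."""
        return amount <= self.effective_limit(resource)
```

A usage example under the same assumptions:

```python
org = Layer("org", {"usd": 100, "api_calls": 1000})
team = Layer("team", {"usd": 500}, parent=org)          # looser than org
sandbox = Layer("sandbox", {"api_calls": 50}, parent=team)

sandbox.effective_limit("usd")        # capped by the org layer
sandbox.permits("api_calls", 60)      # rejected by the sandbox layer
```

Taking the minimum along the chain means the outermost layer always bounds the damage, even when an inner layer is configured more loosely.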

## 4 Orthogonality Demonstration

To validate that the two axes are genuinely independent, we show that \(a\) a single topology serves multiple cognitive functions, and \(b\) a single cognitive function is served by multiple topologies\.

### 4\.1 Same Topology, Different Cognitive Functions

The Loop topology \(T5\) serves at least four distinct cognitive purposes:

- Failure Journal \(C2\): iteratively recording and consolidating error patterns from past executions\.
- Iterative Hypothesis Testing \(C3\): alternating hypothesis generation with evidence gathering through environment interaction\.
- ReAct Loop \(C4\): interleaving reasoning steps and tool execution \[[22](https://arxiv.org/html/2605.13850#bib.bib22)\]\.
- Generator\-Critic \(C5\): iteratively generating and critiquing outputs until a quality threshold is met\.

All four share the same `while(!done)` control structure\. They differ entirely in *what cognitive function* the loop serves: memory consolidation, hypothesis testing, tool use, or quality improvement\.
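
The shared control structure can be written once and parameterized by the cognitive function it serves. The sketch below instantiates it twice, as a toy Generator-Critic (C5) and a toy Failure Journal (C2); the step and stopping functions are hypothetical placeholders, and `max_steps` is an added safety bound.

```python
def run_loop(state, step, done, max_steps=10):
    """The while(!done) skeleton common to every Loop-topology pattern."""
    steps = 0
    while not done(state) and steps < max_steps:
        state = step(state)
        steps += 1
    return state


# C5 Generator-Critic: each pass raises a quality score until a threshold.
def revise(state):
    return {"quality": state["quality"] + 25}


def good_enough(state):
    return state["quality"] >= 80


# C2 Failure Journal: each pass consolidates one pending error record.
def record_next(state):
    error, *rest = state["pending"]
    journal = state["journal"]
    if error not in journal:
        journal = journal + [error]
    return {"pending": rest, "journal": journal}


def drained(state):
    return not state["pending"]
```

The topology lives entirely in `run_loop`; the cognitive function lives entirely in the `step` and `done` arguments.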

### 4\.2 Same Cognitive Function, Different Topologies

Reasoning \(C3\) can be implemented via at least four topologies:

- Chain\-of\-Thought \(T1 Chain\): linear step\-by\-step decomposition \[[23](https://arxiv.org/html/2605.13850#bib.bib23)\]\.
- Complexity\-Based Routing \(T2 Route\): dispatching queries to different reasoning depths\.
- Parallel Exploration \(T3 Parallel\): tree or graph search across multiple reasoning branches simultaneously\.
- Iterative Hypothesis Testing \(T5 Loop\): probe\-observe\-adjust cycles where the agent reasons *through* environment interaction\.

The choice of topology determines latency \(Chain is fastest, Loop is slowest\), cost \(Parallel is most expensive\), and completeness \(Chain may miss alternatives that Parallel would find\)\. This confirms that neither axis reduces to the other: knowing the cognitive function does not determine the topology, and vice versa\.

## 5 Comparison with Existing Frameworks

Table 4: Comparison with existing agent architecture resources\.

No existing framework provides both axes simultaneously\. Liu et al\. \[[10](https://arxiv.org/html/2605.13850#bib.bib10)\] catalog 18 agent design patterns but organize them by a flat category system without orthogonal axes, making it impossible to distinguish patterns that share a topology but serve different cognitive functions\. Dao et al\. \[[11](https://arxiv.org/html/2605.13850#bib.bib11)\] use a system\-theoretic lens with five functional categories but do not cross them with execution topologies\. Our contribution is not the individual axes, both of which have precedents, but their *systematic combination* into a single coordinate system that enables unambiguous pattern identification\.

## 6 Coverage Evaluation

To validate the framework’s descriptive power, we apply a six\-step Pattern Selection Methodology \(Bound → Map → Topology → Select → Impact → Build\) to four real\-world domains\. Each domain was chosen to stress different aspects of the framework: different time constraints, different volumes, different risk profiles, and different governance requirements\.

### 6\.1 Four Domain Case Studies

Table 5: Four case study domains with structurally different architectures derived from the same pattern catalog\.

Financial Lending\. An agent assisting bank credit officers with SME loan assessment\. Seven patterns selected: Context Triage \(filter applicant documents by relevance\), RAG Pipeline \(retrieve regulatory rules and industry benchmarks\), Complexity\-Based Routing \(route simple applications to fast\-track, complex ones to deep analysis\), Iterative Hypothesis Testing \(probe financial statements for inconsistencies\), Generator\-Critic \(regulatory compliance review\), Approval Gate \(human officer makes final decision\), and Observability Harness \(audit logging for regulatory compliance\)\. The 4\-hour budget permits the Orchestrate topology with deep per\-pattern execution\.

Legal Due Diligence\. An agent reviewing 500 contracts for M&A due diligence\. Eight patterns selected, notably adding Fan\-Out/Gather \(parallel contract processing\) and Hierarchical Delegation \(partner → associate → clause\-level review\) to handle volume\. The Hierarchy topology is driven by Law 4 \(volume determines collaboration needs\)\.

Network Operations\. An agent handling telecom NOC alerts within a 5\-minute SLA\. Nine patterns selected, with Route as primary topology to classify alerts by severity and signature match\. Blast Radius Control is critical: the agent can auto\-execute remediation for P3/P4 alerts but must escalate P1/P2\. The tight time budget limits pattern depth \(Law 1\)\.

Healthcare Triage\. An agent assisting ED triage nurses with patient acuity assessment\. Seven patterns selected; the 60\-second budget forces the Chain topology \(the simplest and fastest\)\. Generator\-Critic is parameterized with extreme asymmetry \(Law 3\): the critic is biased toward upgrading acuity, because under\-triage \(sending a critical patient to the waiting room\) is catastrophically worse than over\-triage\.

### 6\.2 Five Laws of Pattern Selection

From cross\-domain comparison, five invariant principles emerge:

Law 1: Time pressure determines architectural complexity\. Days afford Hierarchy \+ Orchestrate \(10\+ patterns\); hours afford Orchestrate \(7–8\); minutes afford Route \+ Loop \(5–7\); seconds afford Chain only \(3–5\)\. *The first fix for a slow prototype is not “optimize each pattern” but “remove a pattern\.”*

Law 2: Action authority determines governance pattern\. Advisory\-only systems need Approval Gate \(human decides\)\. Systems with low\-risk auto\-execution need Blast Radius Control \(pre\-compute impact\)\. Systems with high\-risk irreversible actions need Guardrail Sandwich \(pre\- and post\-checks\)\. Mixed systems need tiered governance\.

Law 3: Failure cost asymmetry reshapes reflection\. When false positives and false negatives have symmetric costs \(lending\), the Generator\-Critic optimizes for accuracy\. When costs are extremely asymmetric \(healthcare: under\-triage is fatal\), the critic is deliberately biased toward the safe error\.

Law 4: Volume determines collaboration needs\. Single\-item processing needs no collaboration patterns\. Moderate volume \(10–50 items\) needs Fan\-Out/Gather\. High volume \(100–500\) needs Hierarchical Delegation \+ Fan\-Out/Gather\. Continuous streams need Route \+ auto\-scaling\.

Law 5: Same pattern, different parameterization\. The same pattern \(e\.g\., Generator\-Critic\) appears in all four domains but behaves differently: 5\-minute regulatory review \(lending\), 30\-second sanity check \(network\), biased safety override \(healthcare\)\. *A pattern is a structural template, not a behavioral prescription\.* The template provides the HOW; the domain provides the WHAT and WHY\.
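
Laws 1 and 2 are close to lookup tables and can be sketched as such. The topologies, pattern counts, and governance patterns below come from the law statements; the exact cut-offs between the seconds, minutes, hours, and days bands are assumptions made for illustration, as are the function names and the authority labels.

```python
MINUTE, HOUR, DAY = 60, 3600, 86400


def topology_for_time_budget(seconds):
    """Law 1: map a time budget to topologies and a pattern-count range.

    The band boundaries (2 minutes, 1 hour, 1 day) are assumed cut-offs.
    """
    if seconds < 2 * MINUTE:
        return {"topologies": ["Chain"], "patterns": (3, 5)}
    if seconds < HOUR:
        return {"topologies": ["Route", "Loop"], "patterns": (5, 7)}
    if seconds < DAY:
        return {"topologies": ["Orchestrate"], "patterns": (7, 8)}
    # "10+" patterns: no upper bound is stated, so None marks it open.
    return {"topologies": ["Hierarchy", "Orchestrate"], "patterns": (10, None)}


# Law 2: action authority -> governance pattern (labels are hypothetical).
GOVERNANCE = {
    "advisory": "Approval Gate",
    "low_risk_auto": "Blast Radius Control",
    "high_risk_irreversible": "Guardrail Sandwich",
    "mixed": "Tiered governance",
}


def governance_for(authority):
    """Look up the governance pattern for a level of action authority."""
    return GOVERNANCE[authority]
```

Under these assumed cut-offs, the 60-second healthcare budget maps to Chain, the 5-minute network SLA to Route and Loop, and the 4-hour lending budget to Orchestrate, matching the primary topologies reported in the case studies.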

### 6\.3 Cross\-Pattern Analysis

Four patterns appear in three or more of the four case studies: Context Triage, RAG Pipeline, Complexity\-Based Routing, and Generator\-Critic\. Their ubiquity suggests these are *foundational* patterns, required by most production agent systems regardless of domain\. In contrast, Blast Radius Control and Fan\-Out/Gather appear only when specific domain constraints \(autonomous action authority, high volume\) are present, suggesting they are *conditional* patterns triggered by environmental factors\.

The framework’s empty cells also carry information\. C5 \(Reflection\) has only three populated cells out of six, making it the sparsest row\. This suggests that reflection in current agent systems is under\-explored: most systems implement only Generator\-Critic loops and have not yet developed reflection patterns using Chain, Route, or Parallel topologies\. We hypothesize that as agent systems mature, patterns such as *Parallel Reflection* \(multiple critics evaluating simultaneously\) and *Reflection Routing* \(dispatching outputs to domain\-specific critics\) will emerge\.

## 7 Discussion

### 7\.1 Scope and Limitations

Non\-perfect orthogonality\. Some combinations are more natural than others\. Governance patterns tend toward Route and Hierarchy topologies because gate\-keeping and containment are inherently hierarchical\. We report the framework honestly: 27 of 42 cells are populated \(64%\), and the distribution is not uniform\. The empty cells may represent genuinely impossible combinations, under\-explored territory, or artifacts of current technology limitations\.

Pattern granularity\. The 27\-pattern count reflects a judgment call about granularity\. One could split Generator\-Critic into self\-critique and cross\-model\-critique \(two patterns\) or merge Chain\-of\-Thought with Prompt Chaining \(reducing the count\)\. We chose a granularity that maximizes architectural discrimination while remaining memorable\.

Temporal validity\. The framework is designed for durability, but the specific patterns will evolve\. As reasoning models internalize Chain\-of\-Thought, the architectural Chain\-of\-Thought pattern may become less relevant, replaced by budget\-aware routing to reasoning models\. The framework accommodates this: the pattern evolves within its cell, but the coordinate system remains stable\.

### 7\.2 Positioning in Software Engineering History

We position agent design patterns as the third generation of a 30\-year software engineering tradition:

1. Object\-Oriented Patterns \(1994\): Gamma et al\. \[[8](https://arxiv.org/html/2605.13850#bib.bib8)\] responded to the challenge of object composition in deterministic systems\. 23 patterns organized by purpose \(creational, structural, behavioral\) and scope\.
2. Enterprise/Distributed Patterns \(2000s\): Fowler \[[9](https://arxiv.org/html/2605.13850#bib.bib9)\] and Hohpe and Woolf \[[24](https://arxiv.org/html/2605.13850#bib.bib24)\] responded to the challenge of distributed system integration\. Patterns organized by architectural layer and communication style\.
3. Agent Patterns \(2024–\): Our framework responds to the challenge of probabilistic, tool\-using, multi\-agent systems\. Patterns organized by cognitive function and execution topology\.

Each generation responded to a fundamental shift in system assumptions\. Agent patterns respond to the shift from deterministic to probabilistic execution, from compile\-time to runtime tool selection, and from single\-process to multi\-agent coordination\.

## 8 Conclusion

We have presented a two\-dimensional framework that classifies AI agent design patterns along cognitive function and execution topology axes\. The 7×6 matrix identifies 27 named patterns and provides a coordinate system for unambiguous pattern identification\. Our coverage evaluation across four real\-world domains demonstrates that the same pattern catalog produces structurally different architectures when different domain constraints are applied, and yields five empirical laws governing pattern selection\.

The framework is designed to be framework\-neutral, model\-agnostic, and durable\. As the underlying models and frameworks evolve, the coordinate system—what the agent needs to do \(cognitive function\) crossed with how it is structurally organized \(execution topology\)—remains stable\. We invite the community to validate, extend, and challenge this framework through additional case studies and empirical evaluation\.

An extended version with full pattern definitions, Python implementations, and additional case studies is in preparation \[[25](https://arxiv.org/html/2605.13850#bib.bib25)\]\.

## References

- \[1\] E\. Schluntz and B\. Zhang, “Building effective agents,” Anthropic Research Blog, Dec\. 2024\.
- \[2\] Google Cloud, “Agent Development Kit: A flexible framework for building multi\-agent systems,” Google Developers Blog, Apr\. 2025\.
- \[3\] H\. Chase et al\., “LangGraph: Multi\-agent workflows,” LangChain Documentation, Feb\. 2025\.
- \[4\] A\. Ng, “What’s next for AI agentic workflows,” Sequoia Capital AI Ascent, Mar\. 2024\.
- \[5\] L\. Wang, C\. Ma, X\. Feng, et al\., “A survey on large language model based autonomous agents,” *Frontiers of Computer Science*, vol\. 18, art\. 186345, 2024\.
- \[6\] Z\. Xi, W\. Chen, X\. Guo, et al\., “The rise and potential of large language model based agents: A survey,” arXiv:2309\.07864, 2023\.
- \[7\] T\. R\. Sumers, S\. Yao, K\. Narasimhan, and T\. L\. Griffiths, “Cognitive architectures for language agents,” *Foundations and Trends in Machine Learning*, vol\. 17, no\. 6, pp\. 882–971, 2024\.
- \[8\] E\. Gamma, R\. Helm, R\. Johnson, and J\. Vlissides, *Design Patterns: Elements of Reusable Object\-Oriented Software*\. Addison\-Wesley, 1994\.
- \[9\] M\. Fowler, *Patterns of Enterprise Application Architecture*\. Addison\-Wesley, 2002\.
- \[10\] Y\. Liu, S\. K\. Lo, Q\. Lu, et al\., “Agent design pattern catalogue: A collection of architectural patterns for foundation model based agents,” *J\. Systems and Software*, vol\. 220, art\. 112278, 2024\.
- \[11\] M\.\-D\. Dao, Q\. M\. Le, H\. T\. Lam, et al\., “Agentic design patterns: A system\-theoretic framework,” arXiv:2601\.19752, 2026\.
- \[12\] N\. F\. Liu, K\. Lin, J\. Hewitt, et al\., “Lost in the middle: How language models use long contexts,” *Transactions of the Association for Computational Linguistics*, vol\. 12, pp\. 157–173, 2024\.
- \[13\] P\. Lewis, E\. Perez, A\. Piktus, et al\., “Retrieval\-augmented generation for knowledge\-intensive NLP tasks,” *Advances in Neural Information Processing Systems*, vol\. 33, pp\. 9459–9474, 2020\.
- \[14\] C\. Packer, S\. Wooders, K\. Lin, et al\., “MemGPT: Towards LLMs as operating systems,” arXiv:2310\.08560, 2023\.
- \[15\] D\. Kahneman, *Thinking, Fast and Slow*\. Farrar, Straus and Giroux, 2011\.
- \[16\] I\. Ong, A\. Almahairi, V\. Wu, et al\., “RouteLLM: Learning to route LLMs with preference data,” arXiv:2406\.18665, 2024\.
- \[17\] H\. Garcia\-Molina and K\. Salem, “Sagas,” *ACM SIGMOD Record*, vol\. 16, no\. 3, pp\. 249–259, 1987\.
- \[18\] J\. Huang, X\. Chen, S\. Mishra, et al\., “Large language models cannot self\-correct reasoning yet,” *Proc\. ICLR*, 2024\.
- \[19\] Z\. Gou, Z\. Shao, Y\. Gong, et al\., “CRITIC: Large language models can self\-correct with tool\-interactive critiquing,” *Proc\. ICLR*, 2024\.
- \[20\] A\. Madaan, N\. Tandon, P\. Gupta, et al\., “Self\-Refine: Iterative refinement with self\-feedback,” *Advances in Neural Information Processing Systems*, vol\. 36, 2023\.
- \[21\] Y\. Du, S\. Li, A\. Torralba, J\. B\. Tenenbaum, and I\. Mordatch, “Improving factuality and reasoning in language models through multiagent debate,” *Proc\. ICML*, 2024\. arXiv:2305\.14325\.
- \[22\] S\. Yao, J\. Zhao, D\. Yu, et al\., “ReAct: Synergizing reasoning and acting in language models,” *Proc\. ICLR*, 2023\.
- \[23\] J\. Wei, X\. Wang, D\. Schuurmans, et al\., “Chain\-of\-thought prompting elicits reasoning in large language models,” *Advances in Neural Information Processing Systems*, vol\. 35, pp\. 24824–24837, 2022\.
- \[24\] G\. Hohpe and B\. Woolf, *Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions*\. Addison\-Wesley, 2003\.
- \[25\] J\. Huang, *Designing AI Agents: Patterns for Building Intelligent Systems*\. Manning Publications, 2026 \(in preparation\)\.