From Automated to Autonomous: Hierarchical Agent-native Network Architecture (HANA)

arXiv cs.AI Papers

Summary

This paper proposes a hierarchical multi-agent reference architecture called HANA for achieving Level 4/5 autonomous networks. It integrates agent self-awareness to harmonize strategic governance with reflexive fault recovery, validated in a 5G Core environment achieving 86% reduction in Mean Time to Repair.

arXiv:2605.20608v1 Announce Type: new

Cached at: 05/22/26, 08:47 AM

# From Automated to Autonomous: Hierarchical Agent-native Network Architecture (HANA)
Source: [https://arxiv.org/html/2605.20608](https://arxiv.org/html/2605.20608)
Binghan Wu, Shoufeng Wang, Yunxin Liu, Ya\-Qin Zhang, Joseph Sifakis, and Ye Ouyang

Binghan Wu, Shoufeng Wang, and Ye Ouyang are with AsiaInfo Technologies Limited, Beijing, China \(e\-mail: \{wubh3, wangsf11, ye\.ouyang\}@asiainfo\.com\)\. Yunxin Liu and Ya\-Qin Zhang are with the Institute for AI Industry Research \(AIR\), Tsinghua University, Beijing, China \(e\-mail: \{liuyunxin, zhangyaqin\}@air\.tsinghua\.edu\.cn\)\. Ye Ouyang is also with the Institute for AI Industry Research \(AIR\), Tsinghua University, Beijing, China\. Joseph Sifakis is with Verimag, Université Grenoble Alpes, Grenoble, France \(e\-mail: joseph\.sifakis@univ\-grenoble\-alpes\.fr\)\.

###### Abstract

Realizing Level 4/5 Autonomous Networks \(AN\) demands a shift from static automation to agent\-native intelligence\. Current operations, reliant on rigid scripts, lack the cognitive agency to handle off\-nominal conditions\. To address this, this letter proposes a hierarchical multi\-agent reference architecture enabling high\-level autonomy\. The framework features a Dual\-Driven Orchestrator that coordinates specialized Executive Agents, supported by a shared Public Memory for unified domain knowledge\. A key innovation is the integration of agent self\-awareness, which empowers the system to harmonize deliberative strategic governance with reflexive fault recovery\. We instantiate and validate this architecture within a 5G Core environment\. Case studies demonstrate that the system sustains critical throughput under congestion and reduces Mean Time to Repair \(MTTR\) by 86%, confirming its efficacy in unifying strategic planning with operational resilience\.

## I\. Introduction

Realizing Level 4/5 Autonomous Networks \(AN\) demands a paradigm shift from static automation to agent\-native intelligence\. While the industry aims for “Zero\-X” experiences\[[4](https://arxiv.org/html/2605.20608#bib.bib1),[3](https://arxiv.org/html/2605.20608#bib.bib11)\], current human\-in\-the\-loop operations struggle with heterogeneous network complexity\. Existing mechanisms \(e\.g\., AIOps\[[10](https://arxiv.org/html/2605.20608#bib.bib2)\], SON\[[2](https://arxiv.org/html/2605.20608#bib.bib3)\], SDN orchestration\[[8](https://arxiv.org/html/2605.20608#bib.bib4)\]\) act as passive, static controllers\. They manage nominal events but lack the cognitive agency to proactively address unforeseen disruptions, leaving a critical gap between automated execution and true autonomous cognition\.

To bridge this gap, we propose the Hierarchical Agent\-native Network Architecture \(HANA\) \(see Figure [1](https://arxiv.org/html/2605.20608#S1.F1)\)\. Unlike recent intent\-driven architectures that merely execute high\-level instructions without context awareness\[[6](https://arxiv.org/html/2605.20608#bib.bib9),[7](https://arxiv.org/html/2605.20608#bib.bib10)\], HANA empowers the network with intrinsic problem\-solving capabilities\. This autonomy is realized by a cognitive architecture grounded in the dual\-process theory of “slow” and “fast” thinking\[[9](https://arxiv.org/html/2605.20608#bib.bib12)\]:

- Internal Drive \(Slow Thinking\): The system’s intrinsic agency is fundamentally derived from an Internal Drive akin to “slow,” deliberative cognition, responsible for long\-term strategic governance and proactive optimization\. In our design, this is operationalized by a Self\-awareness module that actively maintains internal intent \(e\.g\., service optimality\)\. When it detects a deviation between the current network state and these goals, even in the absence of external faults, it autonomously initiates predictive planning to rectify the trend before performance degrades\.
- External Drive \(Fast Thinking\): To complement this deliberation, we superimpose an External Drive mimicking “fast” reactive reflexes for immediate survival\. Triggered by critical environmental alerts, this mechanism bypasses the complex reasoning loop, directing Executive Agents to execute pre\-validated remedial actions for millisecond\-scale fault mitigation\.

By integrating these two mechanisms, HANA establishes a Dual\-Driven Orchestrator Agent that harmonizes long\-term strategic governance with immediate operational resilience\. This agent operationalizes the dual\-process theory through distinct cognitive pathways\. For the Internal Drive, the Orchestrator interacts with Long\-term Memory to retrieve system states and constraints, generating an initial meta\-goal\. This meta\-goal is processed by the Decision Making module, which employs predictive cost\-benefit analysis to formulate a precise internal goal\. Conversely, the External Drive responds to external stimuli: the agent synthesizes perception data with the current system context to define an urgent event\. Architecturally, HANA achieves a strict decoupling of planning and execution\. The Orchestrator functions as the central planner, dispatching the generated goals from the Internal Drive or the External Drive to specialized executive agents\. Acting as domain experts, these agents receive the high\-level directives and perform localized utility analysis to translate them into concrete, atomic Actions via the Intelligent Toolbox\.

The main contributions of this letter are: \(1\) We propose a hierarchical, agent\-native reference architecture that transitions network management from tool\-assisted automation to autonomous problem\-solving\. \(2\) We present a novel dual\-driven cognitive framework that decouples strategic cognition from execution, unifying long\-term intrinsic intent with short\-term survival needs\. \(3\) We validate HANA in a 5G Core environment\. Case studies demonstrate that the system sustains critical throughput under congestion and reduces Mean Time to Repair \(MTTR\) by 86% compared to manual O&M, confirming its efficacy in real\-world scenarios\.

![Refer to caption](https://arxiv.org/html/2605.20608v1/x1.png)

Figure 1: Overview of the proposed Hierarchical Agent\-native Network Architecture \(HANA\)\. The framework features a Dual\-Driven Orchestrator that harmonizes “slow” Internal Drive \(blue flow, a\-d\) and “fast” External Drive \(red flow, 1\-2\)\. It coordinates specialized Executive Agents via the A2A protocol, utilizing a shared Public Memory and an Intelligent Toolbox to achieve a closed\-loop perception\-cognition\-execution cycle for Autonomous Networks\.

## II\. Hierarchical Agent\-native Network Architecture \(HANA\)

### II\-A\. Architecture Overview

As illustrated in Fig\.[1](https://arxiv.org/html/2605.20608#S1.F1), HANA is structured into three logically distinct layers, establishing a closed loop from perception to cognition to execution\.

- Public Memory & Knowledge Layer \(Top\): This layer acts as the unified knowledge base\. It aggregates real\-time Network Performance Metrics and Events\. It also maintains a Public Memory accessible via the Model Context Protocol \(MCP\), storing both General Knowledge and Domain Knowledge\. To ensure resilience, the Public Memory adopts a logically centralized but physically distributed design\[[1](https://arxiv.org/html/2605.20608#bib.bib13)\]\. Its lifecycle management \(continuous updating, snapshot\-based versioning, and conflict resolution\) strictly adheres to industry standards\[[5](https://arxiv.org/html/2605.20608#bib.bib14)\]\.
- Cognitive Core Layer \(Middle\): This is the locus of autonomy, hosting the Dual\-Driven Orchestrator Agent and specialized Executive Agents \(e\.g\., Service Assurance Agent\)\. Agents collaborate via the Agent\-to\-Agent \(A2A\) protocol to deliberate and decide\.
- Intelligent Toolbox \(Bottom\): This layer bridges cognitive intents to telecom\-grade operations\. It encapsulates atomic functions \(e\.g\., Access & Session Control, Policy & Charging\), allowing agents to execute commands safely through standardized interfaces\.

### II\-B\. The Dual\-Driven Orchestrator: Internal and External Drive

The core innovation lies in the Orchestrator Agent, which implements a dual\-driven cognitive model enabling agents to balance immediate reflexes with long\-term strategic planning\.

The Internal Drive \(blue arrows a–d in Fig\. [1](https://arxiv.org/html/2605.20608#S1.F1)\) acts as the strategic planner\. Driven by Self\-awareness, the agent first retrieves operational constraints and task\-specific context from its Private Memory \(step a\)\. It reflects on this internal state to generate a Meta\-goal \(step b\)\. The Meta\-goal is a persistent, strategic intent stored in Private Memory, and it guides the Choice Making module\. By incorporating forward\-looking Predictions derived from situation awareness \(step c\), the module formulates high\-level strategic objectives\. These objectives are finally transmitted as an Internal Goal \(step d\) via the A2A protocol\. The Internal Goal is a transient, tactical task description\. This extensive reasoning cycle embodies a “slow thinking” paradigm that deliberately sacrifices speed to weigh long\-term utility and ensure global optimality\.

The External Drive \(red arrows 1–2 in Fig\. [1](https://arxiv.org/html/2605.20608#S1.F1)\) operates in parallel to handle immediate threats\. Upon detecting an alert, the Situation Awareness module queries memory layers to retrieve system context and logs the active state \(step 1\)\. It then performs a preliminary diagnosis, encapsulating the system state, alert data, and diagnostic results into a Reactive State\-Based Event\. This event is immediately transmitted via the A2A protocol to a specialized Executive Agent \(step 2\)\. This short\-circuited pathway enables a “fast thinking” reflex that bypasses complex goal reasoning to ensure millisecond\-scale mitigation\.
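The two pathways can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class names, the `sla_floor_mbps` and `known_faults` memory keys, and the dictionary lookup standing in for diagnosis are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InternalGoal:
    """Transient, tactical task description produced by the Internal Drive."""
    description: str
    meta_goal: str


@dataclass
class ReactiveEvent:
    """Reactive State-Based Event produced by the External Drive."""
    alert: str
    system_state: dict
    diagnosis: str


@dataclass
class Orchestrator:
    # Hypothetical stand-in for Private Memory (constraints, context, logs).
    private_memory: dict = field(default_factory=dict)

    def internal_drive(self, predicted_mbps: float) -> Optional[InternalGoal]:
        # Step a: retrieve an operational constraint from Private Memory.
        floor = self.private_memory["sla_floor_mbps"]
        # Step b: form a persistent Meta-goal and store it.
        meta_goal = f"keep throughput >= {floor} Mbps"
        self.private_memory["meta_goal"] = meta_goal
        # Step c: weigh the Meta-goal against a forward-looking prediction.
        if predicted_mbps >= floor:
            return None  # no deviation expected, nothing to dispatch
        # Step d: emit a transient Internal Goal for an Executive Agent.
        return InternalGoal("preemptively reserve bandwidth", meta_goal)

    def external_drive(self, alert: str, system_state: dict) -> ReactiveEvent:
        # Step 1: log the active state next to the retrieved context.
        self.private_memory["last_alert"] = alert
        # Preliminary diagnosis: a plain lookup stands in for real matching.
        known = self.private_memory.get("known_faults", {})
        diagnosis = known.get(alert, "unknown")
        # Step 2: package everything for dispatch, skipping goal reasoning.
        return ReactiveEvent(alert, system_state, diagnosis)
```

The point of the split is visible in the control flow: `internal_drive` can decide to do nothing, while `external_drive` always produces an event for immediate dispatch.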

### II\-C\. Executive Agents and Closed\-Loop Execution

Executive Agents translate Orchestrator directives via a rigorous Decision Making process\. Upon receiving an input \(step i\), the agent synthesizes it with historical context \(step ii\) and real\-time state \(step iii\)\. The Goal Management module prioritizes tasks\. To resolve conflicts between drives, HANA employs a strict priority\-based scheduling mechanism\. When the External Drive detects a ‘hard constraint’ violation, the resulting event preempts and pauses any ongoing Internal Goals until the fault is mitigated and safe boundaries are restored\. Then, the Planning module generates a concrete Plan \(step iv\)\. During Execution, the agent converts this plan into atomic Tool Executions, sending commands to the Intelligent Toolbox via the MCP \(step v\)\. Finally, collecting Results \(step vi\) updates the Private Memory \(step vii\), ensuring continuous adaptation\.
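The preemption rule described above can be sketched as a small scheduler. This is a hedged illustration only: the `Task` and `GoalManager` names, the single-active-task model, and the resume order are assumptions, since the letter does not specify the data structures behind Goal Management.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    name: str
    drive: str            # "internal" or "external"
    paused: bool = False


@dataclass
class GoalManager:
    """Hypothetical Goal Management module with strict priority preemption."""
    active: Optional[Task] = None
    waiting: List[Task] = field(default_factory=list)

    def submit(self, task: Task) -> None:
        if self.active is None:
            self.active = task
        elif task.drive == "external" and self.active.drive == "internal":
            # A hard-constraint event preempts and pauses the Internal Goal.
            self.active.paused = True
            self.waiting.insert(0, self.active)  # resumes before later goals
            self.active = task
        else:
            # Everything else waits behind the task that is already running.
            task.paused = True
            self.waiting.append(task)

    def complete_active(self) -> None:
        """Mark the running task done, then pick the next one to run."""
        self.active = None
        # External events go first, then internal goals, in queue order.
        for drive in ("external", "internal"):
            for task in self.waiting:
                if task.drive == drive:
                    self.waiting.remove(task)
                    task.paused = False
                    self.active = task
                    return
```

For example, submitting an internal optimization goal and then an external fault event leaves the fault active and the optimization paused; calling `complete_active()` once the fault is mitigated resumes the optimization.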

## III\. Case Studies

In this section, we present two distinct case studies to validate the autonomous, closed\-loop capabilities of HANA\. Case Study A highlights the proactive operation, where the Orchestrator utilizes Self\-awareness to anticipate risks and drive strategic optimization\. Case Study B demonstrates the reactive mechanism, showcasing how the system executes millisecond\-scale self\-healing in response to critical faults\.

### III\-A\. Key Terminal Proactive Service Assurance

This case study validates the architecture’s Internal Drive \(process a–d and i–vii in Fig\. [1](https://arxiv.org/html/2605.20608#S1.F1)\), focusing on maintaining stringent Service Level Agreements \(SLAs\) for critical terminals in demanding scenarios, such as industrial IoT and autonomous systems\. In these environments, dynamic network congestion is a primary threat\. Traditional reactive operations, triggered only after a throughput collapse or service interruption, are fundamentally inadequate, as post\-failure mitigation is often too late to prevent mission failure\. Therefore, proactive, goal\-driven actions that anticipate network risks are necessary\. HANA addresses this imperative by leveraging its core cognitive capability, enabling agents to use self\-awareness to forecast potential SLA violations and preemptively orchestrate service assurance\.

The workflow is initiated by the Orchestrator Agent, responsible for continuously steering network state toward long\-term strategic goals\. The agent’s Situation Awareness module \(via Perception\) continuously ingests telemetry data, specifically monitoring cell\-level load trends and correlating them with the presence of high\-priority VIP user sessions\. Crucially, the agent’s Self\-awareness component actively evaluates these observed, worsening state trends against its maintained Agent Purpose, specifically the meta\-goal to ensure a critical application’s throughput remains above the 2 Mbps lower bound dynamically retrieved from the VIP terminal’s SLA requirement in the Private Memory\. When the Predictive Model forecasts a high probability of an imminent SLA violation if no action is taken, the agent’s Choice Making logic is triggered\. Instead of waiting for a failure, it synthesizes the risk prediction with its internal profile to generate a new Meta\-goal: “Execute Preemptive Service Assurance for the VIP Terminal\.”
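A minimal sketch of this trigger logic follows. The letter does not describe the Predictive Model, so a least-squares linear extrapolation is swapped in here purely for illustration; the function names, the sampling scheme, and the horizon parameter are assumptions. Only the 2 Mbps floor and the Meta-goal text come from the case study.

```python
from typing import Optional, Sequence

SLA_FLOOR_MBPS = 2.0  # lower bound quoted in the case study


def forecast_throughput(samples: Sequence[float], horizon: int) -> float:
    """Extrapolate a least-squares line `horizon` steps past the last sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + horizon)


def choose_meta_goal(samples: Sequence[float], horizon: int = 5) -> Optional[str]:
    """Return a Meta-goal only when the forecast falls below the SLA floor."""
    if forecast_throughput(samples, horizon) < SLA_FLOOR_MBPS:
        return "Execute Preemptive Service Assurance for the VIP Terminal"
    return None  # trend is safe, so no new Meta-goal is generated
```

With samples that fall by 0.2 Mbps per step from 4.0 to 3.2, the forecast five steps ahead is about 2.2 Mbps and no goal is raised, whereas a ten-step horizon predicts about 1.2 Mbps and triggers the Meta-goal while the observed throughput is still above the floor.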

The Orchestrator Agent transmits this objective as a Proactive Goal via the A2A Protocol to the specialized Service Assurance Agent, establishing the initial task context and defining resource guardrails\. Upon receiving this goal, the Assurance Agent operates within its proactive behavior loop\. Its Goal Management and Planning modules formulate a comprehensive optimization strategy\. This strategy involves preemptively selecting the optimal Next\-Generation QoS Identifier \(NG\-QI\), dynamically elevating the critical flow’s priority, and reserving the required guaranteed\-bitrate allocation to mitigate the impending congestion peak\. The agent translates this high\-level policy into concrete network configuration changes via the Model Context Protocol \(MCP\) and the Intelligent Toolbox, ensuring the measure is applied well in advance of the most critical load surge\.

![Refer to caption](https://arxiv.org/html/2605.20608v1/x2.png)

Figure 2: Video upload rate during congestion for terminals with and without agent\-based assurance, and with a traditional rule\-based script\.

To validate this, we simulated a congestion scenario for a 2 Mbps VIP surveillance camera, comparing HANA against an unprotected baseline and a traditional rule\-based script\. As shown in Fig\. [2](https://arxiv.org/html/2605.20608#S3.F2), the unprotected terminal \(blue line\) suffered a throughput collapse to ∼0\.25 Mbps\. The rule\-based script \(green line\) is inherently reactive: the monitoring debounce it needs to filter transient fluctuations, together with its sequential execution, introduces a ∼30\-second latency\. Consequently, it intervenes only after the SLA is breached, leaving a distinct disruption window\. In sharp contrast, HANA \(red line\) leverages its predictive model to anticipate SLA risks, preemptively orchestrating resource reservations before congestion impacts the terminal\. This zero\-degradation proactive intervention confirms HANA’s cognitive superiority over the post\-fault mitigation of traditional automation\.
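To see why debouncing alone delays a rule-based script, consider the following sketch. It is not the script used in the experiment: the `hold` count, the sample values, and the function name are illustrative assumptions. It only shows that a rule which waits for several consecutive breaches necessarily fires after the first breach.

```python
from typing import Optional, Sequence


def debounced_trigger(samples: Sequence[float], floor: float, hold: int) -> Optional[int]:
    """Return the sample index at which a rule fires after `hold` consecutive breaches."""
    streak = 0
    for i, value in enumerate(samples):
        # A sample at or above the floor resets the streak (transient filtered out).
        streak = streak + 1 if value < floor else 0
        if streak >= hold:
            return i
    return None  # the rule never fired


# Hypothetical throughput trace in Mbps, one sample per monitoring interval.
trace = [3.0, 2.5, 1.8, 2.1, 1.5, 1.0, 0.5, 0.3]
first_breach = debounced_trigger(trace, floor=2.0, hold=1)
rule_fires = debounced_trigger(trace, floor=2.0, hold=3)
```

In this trace the first breach is at index 2 but the debounced rule fires at index 6, so the terminal spends four further intervals degraded before any remediation even starts.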

### III\-B\. Core Network Self\-Healing

This case study validates the architecture’s External Drive \(process 1–2 and i–vii in Fig\. [1](https://arxiv.org/html/2605.20608#S1.F1)\), addressing the critical need for rapid, automated recovery from hidden core network faults\. Modern telecom core networks rely on multi\-technology deployments, making them vulnerable to complex issues such as configuration errors or resource exhaustion\. Traditional O&M relies heavily on manual troubleshooting, which is slow, skill\-dependent, and inefficient as network scale grows\. To minimize the impact of outages on user experience, rapid, autonomous fault recovery has become essential\.

The self\-healing workflow is triggered when the Orchestrator’s Situation Awareness module detects a critical anomaly\. Consider an instance where the monitoring system raises an “HTTP Connection Resource Exhaustion” alarm in a core network Session Management Function \(SMF\) node\. The agent immediately captures this alert, along with relevant network element identifiers and service address details \(e\.g\., 11\.12\.13\.114\)\. In this “fast thinking” reflex loop, the agent bypasses complex strategic planning\. It immediately queries Public Memory \(Domain Knowledge\) to retrieve historical fault cases and configuration templates\. Through pattern matching and contextual analysis, the agent performs a preliminary diagnosis and identifies the root cause: the current maximum HTTP connection setting is 100, significantly lower than the recommended value of 1000 derived by matching historical experience in the Private Memory\.

Instead of triggering a long planning cycle, the Orchestrator encapsulates the system state and this diagnostic result into a Reactive State\-Based Event\. This event is immediately forwarded via the A2A Protocol to the specialized Self\-Healing Executive Agent\. Upon receipt, this agent translates the event into a remedial plan: restore normal connection availability without disrupting services\. It triggers atomic actions via the Model Context Protocol \(MCP\) and the Intelligent Toolbox to automatically update the maximum HTTP connection parameter to 1000 and perform a graceful configuration reload\. The system then continues to monitor connection usage to confirm resolution\.
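The diagnose-then-remediate flow of this scenario can be sketched as follows. Everything structural here is hypothetical: the `DOMAIN_KNOWLEDGE` table, the `max_http_connections` parameter name, the node label, and the action strings are placeholders rather than real toolbox commands. Only the alarm name and the values 100 and 1000 are taken from the case study.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class KnownCase:
    parameter: str
    recommended: int


# Hypothetical Domain Knowledge: alarm name -> parameter and recommended value.
DOMAIN_KNOWLEDGE: Dict[str, KnownCase] = {
    "HTTP Connection Resource Exhaustion": KnownCase("max_http_connections", 1000),
}


@dataclass
class ReactiveStateEvent:
    alarm: str
    node: str
    parameter: str
    current: int
    recommended: int


def diagnose(alarm: str, node: str, config: Dict[str, int]) -> Optional[ReactiveStateEvent]:
    """Match the alarm to a known case and compare the live setting to it."""
    case = DOMAIN_KNOWLEDGE.get(alarm)
    if case is None:
        return None  # no matching historical case
    current = config[case.parameter]
    if current >= case.recommended:
        return None  # setting already meets the recommendation
    return ReactiveStateEvent(alarm, node, case.parameter, current, case.recommended)


def plan_remediation(event: ReactiveStateEvent) -> List[str]:
    """Translate the event into ordered atomic actions (placeholder strings)."""
    return [
        f"set {event.parameter}={event.recommended} on {event.node}",
        f"graceful_reload {event.node}",
        f"monitor {event.parameter} usage on {event.node}",
    ]
```

Calling `diagnose("HTTP Connection Resource Exhaustion", "smf-1", {"max_http_connections": 100})` yields an event carrying both the current and recommended values, which `plan_remediation` turns into the update, reload, and monitor steps described above.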

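| Failure | Agent Status | Dispatch \(min\) | Analysis \(min\) | Resolution \(min\) | Total MTTR \(min\) | Imp\. \(%\) |
| --- | --- | --- | --- | --- | --- | --- |
| AMF Node Unreachable | No Agent | 1 | 30 | 5 | 36 | – |
| AMF Node Unreachable | Rule\-Based | 1 | 15 | 1 | 17 | 52\.78 |
| AMF Node Unreachable | With Agent | 1 | 3 | 1 | 5 | 86\.11 |
| HTTP Connection Resources Insufficient | No Agent | 1 | 10 | 3 | 14 | – |
| HTTP Connection Resources Insufficient | Rule\-Based | 1 | 5 | 1 | 7 | 50\.00 |
| HTTP Connection Resources Insufficient | With Agent | 1 | 1 | 1 | 3 | 78\.57 |
| Total Session Capacity Level\-1 Alarm | No Agent | 1 | 10 | 10 | 21 | – |
| Total Session Capacity Level\-1 Alarm | Rule\-Based | 1 | 5 | 10 | 16 | 23\.81 |
| Total Session Capacity Level\-1 Alarm | With Agent | 1 | 1 | 10 | 12 | 42\.86 |

TABLE I: MTTR comparison with/without agent and traditional rule\-based script\.

To evaluate the agent’s self\-healing capability, we tested three typical core network failure scenarios: \(1\) Access and Mobility Management Function \(AMF\) Node Unreachable, indicating a critical access management function failure affecting user connectivity; \(2\) HTTP Connection Resources Insufficient, reflecting resource exhaustion in control plane communication; \(3\) Total Session Capacity Level\-1 Alarm, representing service capacity saturation requiring immediate expansion\.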

Table I evaluates HANA against manual operations \(“No Agent”\) and a rule\-based sequential decision tree \(e\.g\., Ping checks → Pod status checks → Link checks\)\. While accelerating execution, the rule\-based script suffers from deterministic traps during analysis: it wastes time on rigid, irrelevant checks before ultimately suspending for human intervention\. Conversely, HANA’s “fast thinking” reflex leverages Public Memory for global feature matching, bypassing sequential steps to directly pinpoint root causes\. Consequently, HANA slashed AMF fault analysis time to just 3 minutes, compared to 15 minutes for the rule\-based script and 30 minutes manually\. This confirms HANA elevates “tool\-based automation” to “cognitive autonomy,” effectively eliminating the root cause analysis bottleneck\.
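The improvement column in Table I is consistent with a relative reduction against the manual total, Imp = (T_manual − T) / T_manual × 100. The letter does not state this formula explicitly; it is inferred from the reported numbers, and the short check below reproduces all six percentages under that assumption. The dictionary layout and function names are illustrative.

```python
from typing import Dict, Tuple

# (dispatch, analysis, resolution) in minutes, as reported in Table I.
Phases = Tuple[int, int, int]
TABLE_I: Dict[str, Dict[str, Phases]] = {
    "AMF Node Unreachable": {
        "No Agent": (1, 30, 5), "Rule-Based": (1, 15, 1), "With Agent": (1, 3, 1),
    },
    "HTTP Connection Resources Insufficient": {
        "No Agent": (1, 10, 3), "Rule-Based": (1, 5, 1), "With Agent": (1, 1, 1),
    },
    "Total Session Capacity Level-1 Alarm": {
        "No Agent": (1, 10, 10), "Rule-Based": (1, 5, 10), "With Agent": (1, 1, 10),
    },
}


def improvement_pct(baseline_total: float, total: float) -> float:
    """Relative MTTR reduction versus the manual baseline, in percent."""
    return round((baseline_total - total) / baseline_total * 100, 2)


def summarize(failure: str) -> Dict[str, float]:
    """Improvement of each automated mode over 'No Agent' for one failure."""
    rows = TABLE_I[failure]
    baseline = sum(rows["No Agent"])
    return {
        mode: improvement_pct(baseline, sum(phases))
        for mode, phases in rows.items()
        if mode != "No Agent"
    }
```

The breakdown also shows where the gain comes from: in the AMF case the agent's saving is almost entirely in the analysis phase (30 to 3 minutes), while in the session-capacity case the 10-minute resolution phase is unchanged and caps the achievable improvement.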

## IV\. Conclusion and Discussion

HANA establishes a hierarchical, agent\-native architecture for autonomous networks, successfully decoupling strategic cognition from execution\. By instantiating a dual\-driven model within the Orchestrator, the system harmonizes the “slow thinking” required for long\-term goal management with the “fast thinking” needed for immediate fault resilience\. Case studies confirm the architecture’s efficacy: sustaining VIP SLAs under congestion through proactive governance and reducing fault repair times by up to 86% through reactive reflexes\. These results validate that embedding self\-awareness and distinct cognitive flows is essential for transitioning from automated to truly autonomous networks\.

HANA’s hierarchical design is inherently tailored for scalability\. Executive Agents can scale horizontally, and each Orchestrator is scoped to its domain to prevent centralized bottlenecks\. However, engineering boundaries remain: utilizing Large Language Models \(LLMs\) introduces inherent inference latency for highly concurrent strategic goals, and achieving seamless cross\-domain coordination requires further standardized intent translation mechanisms\. Future work will focus on extending this hierarchical framework to cross\-domain scenarios, coordinating autonomous behaviors across Core, RAN, and Transport domains to achieve end\-to\-end network autonomy\.

## References

- \[1\] \(2024\-06\) System architecture for the 5G System \(5GS\)\. Technical Specification \(TS\) 23\.501, 3rd Generation Partnership Project \(3GPP\)\. Note: Version 18\.5\.0\. External Links: [Link](https://www.3gpp.org/ftp/Specs/archive/23_series/23.501/) Cited by: [1st item](https://arxiv.org/html/2605.20608#S2.I1.i1.p1.1.6)\.
- \[2\] 3GPP \(2020\-10\) Self\-organizing networks \(SON\) for 5G networks\. Technical Specification TS 28\.313, 3GPP \(English\)\. Note: Release 16\. External Links: [Link](https://www.etsi.org/deliver/etsi_ts/128300_128399/128313/16.00.00_60/ts_128313v160000p.pdf) Cited by: [§I](https://arxiv.org/html/2605.20608#S1.p1.1)\.
- \[3\] A\. R\. E\. Boasman\-Patel, S\. Dong, Y\. Wang, C\. Maitre, J\. Domingos, Y\. Troullides, I\. Mas, G\. Traver, and G\. Lupo \(2019\-05\-15\) Autonomous networks: empowering digital transformation for the telecoms industry\. Whitepaper, Release 1\.0, TM Forum, inform\.tmforum\.org\. External Links: [Link](https://www.tmforum.org/wp-content/uploads/2019/05/22553-Autonomous-Networks-whitepaper.pdf) Cited by: [§I](https://arxiv.org/html/2605.20608#S1.p1.1)\.
- \[4\] L\. Huidi, O\. Ye, and J\. Sifakis \(2025\-05\-24\) Autonomous networks driving the progress of telecom sector\. China Daily\. Note: Updated: 2025\-05\-24 08:41\. External Links: [Link](https://www.chinadaily.com.cn/a/202505/24/WS683115ada310a04af22c1489.html) Cited by: [§I](https://arxiv.org/html/2605.20608#S1.p1.1)\.
- \[5\]Cited by:[1st item](https://arxiv.org/html/2605.20608#S2.I1.i1.p1.1.6)\.
- \[6\] A\. Leivadeas and M\. Falkner \(2022\) A survey on intent\-based networking\. IEEE Communications Surveys & Tutorials 25\(1\), pp\. 625–655\. Cited by: [§I](https://arxiv.org/html/2605.20608#S1.p2.1)\.
- \[7\] X\. Li, W\. Shi, H\. Zhang, C\. Peng, S\. Wu, and W\. Tong \(2025\) The agentic\-AI core: an AI\-empowered, mission\-oriented core network for next\-generation mobile telecommunications\. Engineering\. Cited by: [§I](https://arxiv.org/html/2605.20608#S1.p2.1)\.
- \[8\] I\. Ullah, A\. Arishi, S\. K\. Singh, F\. Alharbi, A\. H\. Ibrahim, M\. Islam, Y\. I\. Daradkeh, and C\. Choi \(2025\) Autonomous network management for 6G communication: a comprehensive survey\. Digital Communications and Networks\. External Links: ISSN 2352\-8648, [Document](https://dx.doi.org/https%3A//doi.org/10.1016/j.dcan.2025.07.001), [Link](https://www.sciencedirect.com/science/article/pii/S2352864825001129) Cited by: [§I](https://arxiv.org/html/2605.20608#S1.p1.1)\.
- \[9\] Walter and Krämer \(2014\) Kahneman, D\. \(2011\): Thinking, Fast and Slow\. Statistical Papers\. Cited by: [§I](https://arxiv.org/html/2605.20608#S1.p2.1)\.
- \[10\] Y\. Yang, S\. Yang, C\. Zhao, and Z\. Xu \(2024\) TelOps: AI\-driven operations and maintenance for telecommunication networks\. IEEE Communications Magazine 62\(4\), pp\. 104–110\. External Links: [Document](https://dx.doi.org/10.1109/MCOM.003.2300055) Cited by: [§I](https://arxiv.org/html/2605.20608#S1.p1.1)\.
