ORCA: An End-to-End Interactive Copilot for Optimized Root Cause Analysis

arXiv cs.AI 05/27/26, 04:00 AM Papers
causal-analysis root-cause-analysis copilot large-language-model interactive workshop-paper
Summary
ORCA is a copilot for end-to-end causal analysis that uses agents to guide users through workflows including causal discovery, effect estimation, and root cause analysis, with structured reports.
arXiv:2605.27022v1 Announce Type: new
# ORCA: An End-to-End Interactive Copilot for Optimized Root Cause Analysis
Source: [https://arxiv.org/html/2605.27022](https://arxiv.org/html/2605.27022)

Copyright for this paper by its authors\. Use permitted under Creative Commons License Attribution 4\.0 International \(CC BY 4\.0\)\.

2nd Causal Neuro\-Symbolic Artificial Intelligence \(Causal NeSy\) workshop, May 10–11 2026, Dubrovnik, Croatia

Contact: Juergen\.Luettin@de\.bosch\.com

Nicholas Tagliapietra, Lavdim Halilaj, Kristian Kersting, Juergen Luettin

Robert Bosch GmbH, Germany; Bosch Global Software Technologies Company Limited, Vietnam; Computer Science Department, TU Darmstadt, Germany; Hessian Center for Artificial Intelligence \(hessian\.AI\), Darmstadt; German Center for Artificial Intelligence \(DFKI\)

\(2026\)

###### Abstract

Causal analysis is a crucial task in many domains, including manufacturing, social science, and medicine\. However, despite recent progress, the conceptual and methodological complexity of causal methods makes them largely inaccessible to domain experts\. This gap prevents experts from leveraging these advances and hinders researchers who lack access to real\-world data for validation\. To bridge this divide, we introduce ORCA, a copilot for end\-to\-end causal analysis\. ORCA orchestrates agents to understand the user’s goals and guide them through the most appropriate causal analysis workflow, from fully automatic to highly user\-guided execution\. It features causal discovery, causal effect estimation, explainability and Root\-Cause\-Analysis \(RCA\)\. ORCA evaluates and compares performance, generates key metrics and diagrams, and generates insights through structured reports\. We highlight its effectiveness across several real\-world use\-cases\.

###### keywords:

causal discovery, causal inference, root cause analysis, copilot, large language model

## 1 Introduction

Causal discovery and inference are key to scientific understanding and decision making\. They enable us to move beyond associational patterns and to uncover the underlying mechanisms that govern observed phenomena \[[1](https://arxiv.org/html/2605.27022#bib.bib1)\]\. By performing a causal analysis, we can reveal how system variables interact and influence outcomes\. Causal analysis therefore plays a crucial role across a wide range of application domains, including healthcare \[[2](https://arxiv.org/html/2605.27022#bib.bib2),[3](https://arxiv.org/html/2605.27022#bib.bib3),[4](https://arxiv.org/html/2605.27022#bib.bib4)\], economics \[[5](https://arxiv.org/html/2605.27022#bib.bib5)\], health science \[[6](https://arxiv.org/html/2605.27022#bib.bib6)\], genetics \[[7](https://arxiv.org/html/2605.27022#bib.bib7),[8](https://arxiv.org/html/2605.27022#bib.bib8)\], manufacturing \[[9](https://arxiv.org/html/2605.27022#bib.bib9),[10](https://arxiv.org/html/2605.27022#bib.bib10)\], and IT networks \[[11](https://arxiv.org/html/2605.27022#bib.bib11),[12](https://arxiv.org/html/2605.27022#bib.bib12)\]\. For example, in medical research we are interested in uncovering the causes of a disease, in epidemiology we seek causal relationships between environmental factors and diseases, and in manufacturing we aim to identify root causes of defects to optimize production\.

Despite its importance, modeling causality remains highly challenging, which is why research often stops at the associational level \[[6](https://arxiv.org/html/2605.27022#bib.bib6),[9](https://arxiv.org/html/2605.27022#bib.bib9),[10](https://arxiv.org/html/2605.27022#bib.bib10),[11](https://arxiv.org/html/2605.27022#bib.bib11)\]\. In the health domain, for example, although Randomized Controlled Trials \(RCTs\) \[[13](https://arxiv.org/html/2605.27022#bib.bib13)\] are the standard, they are often time\-consuming, costly, unethical and/or infeasible\. Within the Pearlian framework, by contrast, Structural Causal Models \[[14](https://arxiv.org/html/2605.27022#bib.bib14),[1](https://arxiv.org/html/2605.27022#bib.bib1)\] provide a graph representation of the causal mechanisms behind the phenomena being modeled, which can often be learned from observational data alone if certain identifiability assumptions are fulfilled\. Even then, despite their potential, the usability of such causal models is burdened by high conceptual and methodological complexity\. Consequently, real\-world applications remain scattered unless deep expertise is available, and broader real\-world adoption has not yet been achieved\.

A typical causal application requires a long, complex, and task\-dependent pipeline, which includes domain description, data cleaning and preprocessing, causal discovery, effect estimation, and root cause analysis\. Each phase requires consequential technical decisions, from selecting the appropriate algorithms to tuning hyperparameters and defining evaluation metrics\. This intersection of deep methodological expertise and domain\-specific knowledge is rarely found in a single practitioner, which hinders the adoption of advanced causal methods in real\-world scenarios\.

This adoption gap is further widened by the difficulty of evaluating causal discovery methods\. Unlike standard machine learning tasks, causal tasks often lack standardized benchmarks and evaluation procedures agreed upon by the community\. Moreover, the ground\-truth causal graph is rarely available in practice, forcing researchers to rely heavily on synthetic datasets \[[15](https://arxiv.org/html/2605.27022#bib.bib15)\] or to evaluate on downstream tasks \[[16](https://arxiv.org/html/2605.27022#bib.bib16)\]\. Root cause analysis benchmarks are even scarcer \[[17](https://arxiv.org/html/2605.27022#bib.bib17)\]\. Because fully automated solutions cannot be easily validated against a ground truth, effective causal analysis fundamentally requires a human\-in\-the\-loop approach in which domain experts iteratively inject structural knowledge and validate assumptions, a process not supported by current tooling\.

To bridge the gap between the high demand for causal analysis solutions and their high barrier to entry, we propose ORCA, an interactive agentic copilot designed to assist end\-to\-end causal analysis\. Users can either actively guide the workflow or rely on the copilot’s recommendations, removing technical barriers and enabling broader adoption\. \(Footnote 1: A demo video will be made available\.\)

#### Contribution:

Our contributions are summarized as follows:

- We describe challenges of real\-world scenarios and derive a set of requirements from these scenarios\.
- We present ORCA, our conversational AI assistant for causal analysis, developed to address these requirements\. We detail our vision and its architecture, and elaborate on the main workflow\.
- We explore real\-world use cases where ORCA can be applied, highlighting its practical advantages and simplicity\.

## 2 Motivating Scenario and Requirements

Real\-world systems exhibit complex causal phenomena that are challenging to model due to their causal structure, high dimensionality and structural mechanisms\. Consequently, the selection of an optimal modeling pipeline becomes exceedingly challenging for the domain expert\. We have therefore collected requirements from various scenarios, partially extending those in \[[18](https://arxiv.org/html/2605.27022#bib.bib18)\]\.

1. Req\. 1 End\-to\-End Causal Workflow: The system must dynamically build the most appropriate causal analysis pipeline, from domain definition, data cleaning, and preprocessing to causal discovery, causal effect estimation, RCA, visualization, and report generation\.
2. Req\. 2 Intuitive Interaction and Guidance: ORCA should be friendly to non\-experts in causality and feature conversational, natural\-language interaction via a UI\. It must actively guide novice users, propose the next steps, and adaptively incorporate user feedback throughout the pipeline\.
3. Req\. 3 Data Security and Privacy: Many applications of causality involve highly sensitive data, such as clinical or manufacturing data or intellectual property\. The system must therefore ensure strict data privacy and robust role\-based access control\.
4. Req\. 4 Integration of Domain Knowledge: To overcome the limitations of observational data, the system must allow the injection of domain knowledge \(e\.g\., prohibited or required causal relationships\) sourced from the user, existing documents, or LLM\-based queries\.
5. Req\. 5 Interpretability and Traceability: The system should provide interpretable results, describe the individual stages in diagrams, allow the user to visualize intermediate results, and clearly explain the reasoning behind its algorithmic choices and root cause identifications\.
6. Req\. 6 Algorithmic Recommendation and Automation: The system must provide access to state\-of\-the\-art \(SOTA\) causal methods\. Building on AutoML principles, it should recommend the most appropriate algorithms based on dataset characteristics and user preferences\.

## 3 Related Work

Traditional causal analysis requires the manual choice of discovery and inference algorithms \[[19](https://arxiv.org/html/2605.27022#bib.bib19),[20](https://arxiv.org/html/2605.27022#bib.bib20),[17](https://arxiv.org/html/2605.27022#bib.bib17)\], and repeated iterations with domain experts in order to validate results\. Recently, however, research has started to focus on automating and assisting these workflows using agentic systems\.

Causal Reasoning driven by LLMs\. Causal reasoning capabilities of LLMs have been explored in \[[21](https://arxiv.org/html/2605.27022#bib.bib21),[22](https://arxiv.org/html/2605.27022#bib.bib22)\], which essentially conclude that these capabilities are likely due to the memorization of cause\-effect pairs and show that they lack generalization\. Nonetheless, \[[23](https://arxiv.org/html/2605.27022#bib.bib23),[24](https://arxiv.org/html/2605.27022#bib.bib24),[25](https://arxiv.org/html/2605.27022#bib.bib25),[26](https://arxiv.org/html/2605.27022#bib.bib26),[27](https://arxiv.org/html/2605.27022#bib.bib27),[28](https://arxiv.org/html/2605.27022#bib.bib28)\] argue and show the potential of LLMs in aiding causal discovery tasks by leveraging their internal prior, especially in real\-world scenarios where causal assumptions and identifiability fail\. Finally, the authors of \[[29](https://arxiv.org/html/2605.27022#bib.bib29),[30](https://arxiv.org/html/2605.27022#bib.bib30)\] show the efficacy of LLMs in extracting causal relationships from unstructured textual data\.

Human\-in\-the\-Loop and Tool\-Instructed Copilots for Data Science\. Copilots for clinical predictive modeling have been introduced in \[[18](https://arxiv.org/html/2605.27022#bib.bib18)\] and \[[31](https://arxiv.org/html/2605.27022#bib.bib31)\], which was later extended to treatment effect estimation in \[[32](https://arxiv.org/html/2605.27022#bib.bib32)\]\. A first copilot for causal analysis within the Pearlian framework of causality has been proposed in \[[33](https://arxiv.org/html/2605.27022#bib.bib33)\], which guides the user through causal discovery, causal inference, and causal explanations\. Within the \(non\-causal\) Explainability and RCA domain, \[[34](https://arxiv.org/html/2605.27022#bib.bib34)\] combines LLMs and Granger causality \[[35](https://arxiv.org/html/2605.27022#bib.bib35)\] for RCA in network incidents\. Further, a neuro\-symbolic causal analysis agent for manufacturing has been presented in \[[36](https://arxiv.org/html/2605.27022#bib.bib36),[37](https://arxiv.org/html/2605.27022#bib.bib37)\]\. Finally, an approach for the more general task of causal effect estimation based on an LLM\-augmented causal tool has been presented in \[[38](https://arxiv.org/html/2605.27022#bib.bib38)\]\.

In our vision, ORCA extends existing work in interactive and LLM\-assisted causal reasoning by filling the gaps and combining \(1\) SOTA causal discovery and RCA methods, \(2\) automated method recommendation, \(3\) scalability to use cases with partial causal information and/or without a causal graph, and \(4\) integration of domain knowledge from both structured and unstructured sources\.

![Refer to caption](https://arxiv.org/html/2605.27022v1/Images/architecture.png)

Figure 1: ORCA is an assistant for Causal Analysis\. By interacting with the user and the data and information they provide \(left\), it orchestrates the execution of the most appropriate workflow for a specific causal analysis task\. It features a wide variety of methods \(center\) and generates reports tailored to the user’s needs \(right\)\.

## 4 ORCA

We designed ORCA to address the outlined requirements\. A conceptual overview is shown in Fig\. [1](https://arxiv.org/html/2605.27022#S3.F1), comprising three main pillars: 1\) User Input, through which the system receives domain knowledge, observational data, and user queries; 2\) Conversational Assistant, built on top of a multi\-agent framework to streamline the interaction across the integrated modules; and 3\) Generated Output, responsible for providing the results in various formats and visualization options\.

### 4\.1 System Architecture

The architecture of ORCA is designed to provide a robust, no\-code environment for causal analysis, directly addressing the needs of domain experts\. As illustrated in Fig\. [2](https://arxiv.org/html/2605.27022#S4.F2), the framework can be divided into \(1\) User Interaction, which takes care of task management, the chat service, and the LLM backbone; \(2\) Workflow Management, which is responsible for the whole workflow chain, including user interaction and agent execution; and \(3\) Individual Agents\.

![Refer to caption](https://arxiv.org/html/2605.27022v1/Images/workflow.png)

Figure 2: Multi\-agent architecture illustrating the workflow management backbone \(middle\), which orchestrates the whole workflow with user interaction \(top\) and executes the required individual agents \(bottom\)\.

#### Request Management Unit \([Req\. 1](https://arxiv.org/html/2605.27022#S2.I1.i1)\)\.

ORCA manages workflow states via a centralized orchestration mechanism\. The system routes API calls to remotely hosted LLMs while offloading intensive algorithmic workloads to available GPU clusters\. A central Generation Unit dynamically translates the orchestrator’s high\-level reasoning into executable Python scripts, securely bridging the gap between natural language intents and the underlying state\-of\-the\-art \(SOTA\) causal libraries\.

#### Computation Time Estimation \([Req\. 2](https://arxiv.org/html/2605.27022#S2.I1.i2)\)\.

Many algorithms involving causal models are computationally intractable\. Causal discovery and complex RCA algorithms in particular scale poorly with the dimensionality of the data, causing processing times to explode\. To mitigate this critical operational constraint, the system evaluates the selected method’s theoretical complexity, the dataset’s dimensionality, and the currently available hardware \(e\.g\., GPU clusters\)\. It then provides a runtime estimate that helps users make informed decisions between accuracy and execution speed\. The motivation is that distinct real\-world applications have different time constraints, which creates a critical trade\-off: sometimes a diagnosis, or the analysis of an anomaly in a manufacturing line, has to be carried out on very short timelines, while at other times a longer compute time is an acceptable price for higher accuracy\.
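
The estimation step described above can be sketched as follows. This is a minimal illustration, not ORCA's implementation: the cost formulas, the constants, and the names `COST_MODELS`, `estimate_runtime_seconds`, and `methods_within_budget` are assumptions introduced here, and hardware is reduced to a single hypothetical throughput figure.

```python
# Hypothetical cost models: rough operation counts as a function of the
# number of samples n and variables d. The formulas and constants are
# illustrative assumptions, not complexities reported for ORCA.
COST_MODELS = {
    "pc": lambda n, d: n * d ** 3,
    "lingam": lambda n, d: n * d ** 3,
    "notears": lambda n, d: 100 * (n * d ** 2 + d ** 3),
}


def estimate_runtime_seconds(method, n_samples, n_vars, ops_per_second):
    """Return a coarse runtime estimate for one method on one hardware profile."""
    if ops_per_second <= 0:
        raise ValueError("ops_per_second must be positive")
    return COST_MODELS[method](n_samples, n_vars) / ops_per_second


def methods_within_budget(n_samples, n_vars, ops_per_second, budget_seconds):
    """List (method, estimate) pairs that fit the time budget, fastest first."""
    estimates = [
        (name, estimate_runtime_seconds(name, n_samples, n_vars, ops_per_second))
        for name in COST_MODELS
    ]
    fitting = [(name, t) for name, t in estimates if t <= budget_seconds]
    return sorted(fitting, key=lambda pair: (pair[1], pair[0]))
```

A user with a tight deadline could call `methods_within_budget` with a small budget to see which methods remain feasible, and relax the budget when accuracy matters more than speed.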

#### Security and Privacy \([Req\. 3](https://arxiv.org/html/2605.27022#S2.I1.i3)\)

To ensure secure handling of data, the system uses TLS\-encrypted communication between the user interface and backend services via authenticated APIs\. Access to datasets and outputs from Causal Discovery or Root Cause Analysis is controlled through role\-based access control \(RBAC\) integrated with the authentication and authorization layer\. Each interaction runs within an isolated session context to prevent cross\-user data leakage\.
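
As a rough sketch of how role\-based access control and session isolation can be combined, consider the check below. The roles, permissions, and the names `ROLE_PERMISSIONS`, `Session`, and `can_access` are hypothetical; the paper does not specify ORCA's actual roles or authorization layer.

```python
from dataclasses import dataclass

# Hypothetical roles and permissions; the paper does not list ORCA's actual roles.
ROLE_PERMISSIONS = {
    "viewer": {"read_report"},
    "analyst": {"read_report", "read_dataset", "run_analysis"},
    "admin": {"read_report", "read_dataset", "run_analysis", "manage_users"},
}


@dataclass(frozen=True)
class Session:
    user_id: str
    role: str
    session_id: str


def can_access(session, action, resource_session_id):
    """Allow an action only if the role grants it and the resource
    belongs to the caller's own session (isolation between users)."""
    allowed = ROLE_PERMISSIONS.get(session.role, set())
    return action in allowed and resource_session_id == session.session_id
```

Unknown roles receive an empty permission set, so the check denies by default.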

### 4\.2 User Interaction

Causal models have historically been built by bridging data with human expertise\. To facilitate this, ORCA offers interaction in two modalities: natural language and a Graphical User Interface \(GUI\)\.

The core of the interaction happens via a Natural Language Interface \(NLI\)\. In the NLI, the user can state the goal, steer ORCA, and inject prior knowledge\. Conversely, the system can provide suggestions on the optimal pipeline and recommend algorithms\. To further facilitate the interaction, we complement the NLI with a GUI that offers more granular control and easier inspection of results\.

#### Knowledge and Data Ingestion \([Req\. 4](https://arxiv.org/html/2605.27022#S2.I1.i4)\)

Each session starts with the user providing data and the relevant context in natural language\. Users can upload observational/interventional datasets along with domain knowledge in varied forms \(e\.g\. manuals, technical reports, knowledge graphs, databases\)\. While the provided data is essential for the causal algorithms, the agents can use the additional domain knowledge to extract as much information as possible\. For example, this prior knowledge can be used to clarify the meaning of variables and remove ambiguity, or to extract specific causal relationships, or to absorb knowledge about diagnoses or anomalies that happened in the past\.
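
One simple way to represent extracted causal relationships is as sets of required and forbidden directed edges that are applied to a candidate graph. The sketch below assumes that representation; the function names and the variable names in the usage note are hypothetical, and the paper does not state how ORCA encodes prior knowledge internally.

```python
def is_acyclic(edges):
    """Kahn's algorithm: True if the directed graph given by edges has no cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for src, dst in edges:
        indegree[dst] += 1
        children[src].append(dst)
    queue = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return visited == len(nodes)


def apply_edge_constraints(candidate_edges, required, forbidden):
    """Drop forbidden edges, add required ones, and reject cyclic results."""
    required, forbidden = set(required), set(forbidden)
    if required & forbidden:
        raise ValueError("an edge cannot be both required and forbidden")
    constrained = (set(candidate_edges) - forbidden) | required
    if not is_acyclic(constrained):
        raise ValueError("constraints produce a cyclic graph")
    return constrained
```

For instance, forbidding `("pressure", "temperature")` and requiring `("roughness", "leak")` removes the first edge from a candidate graph and adds the second, while contradictory or cycle\-inducing knowledge is surfaced as an error for the user to resolve.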

#### Interactive Orchestration \([Req\. 2](https://arxiv.org/html/2605.27022#S2.I1.i2)\)

Since causal inference based on purely observational data is rarely possible or sufficiently statistically significant, the system relies strongly on prior knowledge and user validation\. In practice, ORCA employs a state management module that logs each step that has been performed along with its results\. This makes it possible to track progress toward the underlying objective defined by the user\. Consequently, if the user disagrees with the current pipeline, they can navigate back to previous steps to, for example, integrate new prior knowledge or execute a different algorithm without losing the global context\. The LLM continuously uses the session history to ensure that each reasoning step is logically consistent with past actions\.
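
The step log with backward navigation can be pictured with a small data structure. This is an assumed design for illustration only; `Step`, `SessionState`, and their methods are hypothetical names, and archiving discarded steps is a choice made here to keep the session traceable.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    params: dict
    result: object


@dataclass
class SessionState:
    objective: str                                # user-defined goal, kept across rollbacks
    steps: list = field(default_factory=list)     # active pipeline
    archive: list = field(default_factory=list)   # discarded steps, kept for traceability

    def log(self, name, params, result):
        """Record a completed step and return its index."""
        self.steps.append(Step(name, params, result))
        return len(self.steps) - 1

    def rollback_to(self, index):
        """Keep steps up to and including index; archive and return the rest."""
        if not 0 <= index < len(self.steps):
            raise IndexError("no such step")
        discarded = self.steps[index + 1:]
        self.steps = self.steps[:index + 1]
        self.archive.extend(discarded)
        return discarded

    def history(self):
        return [step.name for step in self.steps]
```

After a rollback, the user can log a new step (for example, a different discovery algorithm) while the objective and the earlier preprocessing results remain available.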

#### Data Visualization and Report Generation \([Req\. 5](https://arxiv.org/html/2605.27022#S2.I1.i5)\)

Throughout the execution, the system offers the possibility to explore and visualize the data, including descriptive statistics and exploratory data analysis \(EDA\)\. Finally, ORCA automatically synthesizes the whole causal analysis and the uncovered causal phenomena into a comprehensive, human\-readable report that compiles each decision, causal graph, and downstream method\. This final synthesis aims to abstract away complex algorithmic outputs and make the causal analysis accessible, reproducible, and interpretable to a broader audience\.

### 4\.3 Concrete Features

#### SOTA Algorithms and Recommendations \([Req\. 6](https://arxiv.org/html/2605.27022#S2.I1.i6)\)\.

ORCA provides a comprehensive suite of SOTA tools and methods covering the entire causal analysis workflow, which we list in Table 1\. ORCA automatically analyzes dataset characteristics to recommend the most appropriate algorithms\. Once candidates are selected, ORCA applies AutoML principles to search for the optimal method and associated hyperparameters for the specific task\. To ensure rigorous evaluation and reproducibility, the system also integrates a suite of standard metrics and off\-the\-shelf benchmarking datasets\.
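
A recommendation based on dataset characteristics could, in its simplest form, be a set of rules over the commonly stated assumptions of each method (for example, LiNGAM targets linear non\-Gaussian models and FCI tolerates latent confounders). The rule set below is a hypothetical sketch and not ORCA's recommender; method names are taken from Table 1, and `recommend_discovery_methods` is an illustrative name.

```python
def recommend_discovery_methods(time_series, linear, gaussian_noise, latent_confounders):
    """Map coarse dataset characteristics to candidate discovery methods.

    The rules are illustrative assumptions; a real recommender would also
    weigh sample size, dimensionality, and user preferences.
    """
    if time_series:
        return ["PCMCI", "NeuralGC"]
    if latent_confounders:
        return ["FCI"]                      # designed for possible hidden common causes
    if linear and not gaussian_noise:
        return ["LiNGAM", "PC", "GES"]      # non-Gaussianity enables LiNGAM
    if not linear:
        return ["ANM", "PNL"]               # nonlinear functional-model methods
    return ["PC", "GES", "NOTEARS"]         # linear-Gaussian default candidates
```

The returned candidates would then be handed to a hyperparameter search, which this sketch does not cover.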

#### Datasets and Benchmarks

To facilitate empirical evaluation, establish empirical baselines, and ensure reproducibility, ORCA natively implements a suite of standard benchmark datasets and data generation simulators\. Real\-world and benchmark datasets include CausalChambers \[[39](https://arxiv.org/html/2605.27022#bib.bib39)\], CausalMan \[[40](https://arxiv.org/html/2605.27022#bib.bib40)\], Petshop \[[41](https://arxiv.org/html/2605.27022#bib.bib41)\], Sockshop \[[42](https://arxiv.org/html/2605.27022#bib.bib42)\], and ProRCA \[[43](https://arxiv.org/html/2605.27022#bib.bib43)\]\. Furthermore, the system features simulations for random graphs, allowing users to generate Erdős–Rényi \[[44](https://arxiv.org/html/2605.27022#bib.bib44)\] and Scale\-Free \[[45](https://arxiv.org/html/2605.27022#bib.bib45)\] graphs of arbitrary size and functional form \(linear or non\-linear\)\. These simulators support additive noise drawn from Gaussian, Gumbel, or Uniform distributions\. To rigorously benchmark Root Cause Analysis and causal explainability, ORCA can systematically synthesize root causes by injecting hard, soft, single, or multiple interventions into the simulated data\.
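
To make the simulation setup concrete, the following sketch generates an Erdős–Rényi\-style DAG, samples from a linear additive\-noise model, and injects interventions. It is a simplified stand\-in written with the Python standard library only: the function names, the unit noise scales, and the choice to model a soft intervention as an additive shift are assumptions, not details taken from ORCA.

```python
import math
import random


def random_dag(n_nodes, edge_prob, rng):
    """Erdős–Rényi-style DAG: each edge i -> j with i < j is kept with
    probability edge_prob, so the node order 0..n-1 is a topological order."""
    return {(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)
            if rng.random() < edge_prob}


def sample_noise(kind, rng):
    if kind == "gaussian":
        return rng.gauss(0.0, 1.0)
    if kind == "uniform":
        return rng.uniform(-1.0, 1.0)
    if kind == "gumbel":
        u = max(rng.random(), 1e-12)        # avoid log(0)
        return -math.log(-math.log(u))      # inverse-CDF sampling of a standard Gumbel
    raise ValueError(f"unknown noise kind: {kind}")


def sample_linear_sem(n_nodes, weights, noise, rng, hard=None, soft=None):
    """Draw one sample. weights maps (parent, child) -> coefficient.
    hard[j] fixes node j to a value; soft[j] shifts node j additively."""
    hard, soft = hard or {}, soft or {}
    x = [0.0] * n_nodes
    for j in range(n_nodes):
        if j in hard:
            x[j] = hard[j]                  # hard intervention cuts incoming mechanisms
            continue
        parent_sum = sum(w * x[i] for (i, k), w in weights.items() if k == j)
        x[j] = parent_sum + sample_noise(noise, rng) + soft.get(j, 0.0)
    return x
```

Passing several entries in `hard` or `soft` yields multiple simultaneous root causes, and the intervened node indices serve as ground truth when scoring an RCA method.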

Table 1: Overview of ORCA Supported Features, Algorithms, and Metrics\.

| Module | Supported Methods & Features | Metrics / Output |
| --- | --- | --- |
| Preprocessing & Data Analysis | Data Cleaning & Preprocessing: Data type conformance checking, categorical data encoding\. Missing value detection, imputation, and dropping of sparse parameters/samples\. Unique value, parameter, and sample treatment\. Normalization \(zero\-mean unit\-variance, robust, min\-max scalers\)\. Data Analysis: Descriptive statistics, bivariate and multivariate Exploratory Data Analysis \(EDA\), and data distributions\. | Cleaned and encoded datasets, descriptive statistics, visualization plots\. |
| Causal Discovery \(table note 1\) | Static Data: Constraint\-based: PC \[[1](https://arxiv.org/html/2605.27022#bib.bib1)\], FCI \[[46](https://arxiv.org/html/2605.27022#bib.bib46)\]; Score\-based: GES \[[47](https://arxiv.org/html/2605.27022#bib.bib47)\], XGES \[[48](https://arxiv.org/html/2605.27022#bib.bib48)\], GRaSP \[[49](https://arxiv.org/html/2605.27022#bib.bib49)\]; Continuous: NOTEARS \[[50](https://arxiv.org/html/2605.27022#bib.bib50)\], GOLEM \[[51](https://arxiv.org/html/2605.27022#bib.bib51)\], CORL \[[52](https://arxiv.org/html/2605.27022#bib.bib52)\]; Functional/Hybrid: LiNGAM \[[53](https://arxiv.org/html/2605.27022#bib.bib53)\], ANM \[[54](https://arxiv.org/html/2605.27022#bib.bib54)\], PNL \[[55](https://arxiv.org/html/2605.27022#bib.bib55)\]; LLM\-based: CausalSteward \[[56](https://arxiv.org/html/2605.27022#bib.bib56)\]\. Time Series Data: Constraint\-based: PCMCI \[[57](https://arxiv.org/html/2605.27022#bib.bib57)\]; Granger Causality: NeuralGC \[[58](https://arxiv.org/html/2605.27022#bib.bib58)\]\. | Structural Hamming Distance \(SHD\), normalized SHD\. |
| Root Cause Analysis \(RCA\) \(table note 2\) | Graph\-Required: Traversal \[[59](https://arxiv.org/html/2605.27022#bib.bib59)\], CI\-RCA \[[60](https://arxiv.org/html/2605.27022#bib.bib60)\], Counterfactual attribution \[[61](https://arxiv.org/html/2605.27022#bib.bib61)\], Score Traversal \[[17](https://arxiv.org/html/2605.27022#bib.bib17)\]\. Graph\-Free: RCD \[[42](https://arxiv.org/html/2605.27022#bib.bib42)\], Cholesky Composition \[[7](https://arxiv.org/html/2605.27022#bib.bib7)\], Score Ordering \[[17](https://arxiv.org/html/2605.27022#bib.bib17)\]\. | Precision, Recall, F1, Accuracy, NDCG, MRR, MAP@k\. |

Table note 1: Includes algorithms derived from the gcastle library \(https://github\.com/huawei\-noah/trustworthyAI\)\.

Table note 2: Includes algorithms derived from the RCA library \(https://github\.com/amazon\-science/RCAWithMissingStructuralKnowledgeCode\)\.
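
Two of the metrics listed in Table 1 can be written down compactly. The sketch below assumes both graphs are DAGs given as sets of directed edges, counts a reversed edge as a single error, and normalizes SHD by the number of node pairs; other conventions exist, and the paper does not state which one ORCA uses. The function names are illustrative.

```python
def shd(true_edges, est_edges):
    """Structural Hamming Distance between two DAGs given as (src, dst) sets.
    Missing and extra adjacencies count 1 each; a reversed edge counts 1."""
    true_edges, est_edges = set(true_edges), set(est_edges)
    true_skeleton = {frozenset(e) for e in true_edges}
    est_skeleton = {frozenset(e) for e in est_edges}
    missing_or_extra = len(true_skeleton ^ est_skeleton)
    reversed_edges = sum(1 for (a, b) in true_edges
                         if (b, a) in est_edges and (a, b) not in est_edges)
    return missing_or_extra + reversed_edges


def normalized_shd(true_edges, est_edges, n_nodes):
    """SHD divided by the number of node pairs (one possible normalization)."""
    if n_nodes < 2:
        raise ValueError("need at least two nodes")
    return shd(true_edges, est_edges) / (n_nodes * (n_nodes - 1) // 2)


def mean_reciprocal_rank(rankings, true_causes):
    """MRR over incidents: rankings[i] is a ranked candidate list and
    true_causes[i] the true root cause; a miss contributes 0."""
    total = 0.0
    for ranking, cause in zip(rankings, true_causes):
        if cause in ranking:
            total += 1.0 / (ranking.index(cause) + 1)
    return total / len(rankings)
```

For example, with true edges A→B, B→C, A→D and estimated edges A→B, C→B, C→D, the distance is 3: one missing adjacency, one extra adjacency, and one reversal.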

## 5 Case Studies

To highlight the versatility of ORCA and showcase its intuitive usage, we present common problem statements in different domains and describe how ORCA guides the user through a full solution strategy and provides the respective algorithm outcomes and visualizations\.

1. Cloud Computing Provider: A cloud service provider operates complex applications composed of numerous microservices\. The system experiences an outage or a significant performance degradation, triggering alerts for multiple services\. These could be, for example, infrastructure failures, application\-level failures, or external dependency failures\. The flood of alerts makes it difficult for the site reliability engineering team to distinguish between symptoms and actual root causes\. The provider wants to detect the root cause of the failure, that is, the specific service or component that initiated the problem\. ORCA can rapidly and automatically identify the true root cause, pinpointing the precise point of failure so that the issue can be resolved and downtime minimized\.
2. Retail: A clothing company is facing declining overall profitability due to rising material costs and shifting consumer behavior\. To counteract this trend, it needs to increase its profit margins\. Upon analysis, the company discovers a significant and unexplained variance in profit margins across different product categories and sales channels\. The leadership team needs to understand the causal drivers of this variance to devise an effective strategy\. The company’s data analytics team has access to a massive dataset containing sales transactions, product costs, pricing information, and promotional discount data\. While they can see correlations, they cannot distinguish between cause and effect\. ORCA uses causal discovery to model the relationships between the key business levers\. The resulting model can then be used to estimate causal impact, e\.g\., how a change in the discount rate affects the profit margin\. By validating different factors with this model, the company can decide on the most effective measures to maximize its profit\.
3. Manufacturing: A manufacturing company that assembles magnetic valves and hydraulic blocks into hydraulic units is detecting increased leakage failures during pressure testing, leading to a decrease in production yield\. The process engineers are interested in finding the cause of this leakage\. They consider partially known functional relationships in the system and let ORCA perform the Cholesky Composition on their data to get a ranked list of potential root causes\. After testing the top three candidates, they find that surface roughness variation in the hydraulic blocks was preventing seals from fully conforming to the surface, ultimately leading to the defects\.
4. Semiconductor Fabrication: A semiconductor fabrication plant \(fab\) manufactures complex integrated circuits on silicon wafers\. The production process is incredibly intricate, involving hundreds of discrete steps \(like etching, lithography, deposition, and cleaning\) that take place over several months\. Each production step is meticulously monitored, generating thousands of process parameters and sensor readings that are collected and stored\. The fab experiences a "yield excursion", a sudden, unacceptable drop in the percentage of functional chips on its wafers\. Process engineers are now faced with the task of finding the root cause of the defects\. In this scenario, a swift response is critical to prevent high losses from scrapped wafers\. ORCA analyses the dataset to automatically detect the root cause of the yield drop\. The goal is to pinpoint the specific event or parameter drift that initiated the failure cascade, for example a subtle, out\-of\-spec deviation that is causally linked to the defect, such as the plasma pressure in a specific etching tool having been 0\.2% below its target range for a 30\-minute window two months ago\. By rapidly and automatically identifying the true root cause, the fab can immediately implement corrective actions to protect production yield\.
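
For the cloud scenario in case 1, separating symptoms from root causes can be illustrated with a simplified traversal\-style heuristic: given a causal graph and a set of alerting services, report those alerting services that have no alerting parent\. This is a sketch of the general idea only, not the specific Traversal method cited in Table 1 or ORCA's implementation; the service names and edge directions are made up for illustration\.

```python
def traversal_root_causes(edges, anomalous):
    """Return anomalous nodes none of whose causal parents are anomalous.
    edges is a set of (cause, effect) pairs; anomalous is a set of node names."""
    anomalous = set(anomalous)
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
    return sorted(node for node in anomalous
                  if not (parents.get(node, set()) & anomalous))


# Hypothetical microservice dependency graph, oriented cause -> effect.
service_graph = {
    ("db", "orders"),
    ("orders", "checkout"),
    ("checkout", "frontend"),
    ("cache", "frontend"),
}
alerts = {"db", "orders", "checkout", "frontend"}
root_causes = traversal_root_causes(service_graph, alerts)  # only "db" has no alerting parent
```

Services downstream of the failing database also alert, but each of them has an alerting parent and is therefore treated as a symptom\.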

![Refer to caption](https://arxiv.org/html/2605.27022v1/Images/scenario3.png) \(a\) Manufacturing
![Refer to caption](https://arxiv.org/html/2605.27022v1/Images/scenario2.png) \(b\) Retail

Figure 3: Scenarios\. Example of ORCA functionalities in different use cases: \(a\) Manufacturing and \(b\) Retail\.

## 6 Conclusion

In this paper, we introduced ORCA, an LLM\-powered interactive copilot designed to enable causal analysis\. Through intuitive conversational interactions and a human\-in\-the\-loop workflow, users can navigate complex tasks, including causal discovery and Root Cause Analysis \(RCA\)\. The system orchestrates the best pipeline by proactively suggesting the next steps, recommending optimal state\-of\-the\-art algorithms based on dataset characteristics, and automating hyperparameter tuning\. By abstracting the steep methodological complexity of causal inference, we believe ORCA represents a first\-of\-its\-kind solution that empowers non\-experts to independently extract causal insights from real\-world data\.

## Declaration on Generative AI

During the preparation of this work, the author\(s\) used Gemini in order to: perform stylistic editing, summarize text, and check grammar and spelling\. After using this tool/service, the author\(s\) reviewed and edited the content as needed and take full responsibility for the publication’s content\.

## References

- Spirtes et al\. \[2000\]P\. Spirtes, C\. Glymour, R\. Scheines, Causation, Prediction, and Search, 2 ed\., MIT Press, Cambridge, MA, 2000\.
- Kellogg et al\. \[2017\]K\. M\. Kellogg, Z\. Hettinger, M\. Shah, R\. L\. Wears, C\. R\. Sellers, M\. Squires, R\. J\. Fairbanks,Our current approach to root cause analysis: is it contributing to our failure to improve patient safety?,BMJ Quality & Safety 26 \(2017\) 381–387\.
- Prosperi et al\. \[2020\]M\. C\. F\. Prosperi, Y\. Guo, M\. Sperrin, J\. S\. Koopman, J\. Min, X\. He, S\. N\. Rich, M\. Wang, I\. E\. Buchan, J\. Bian,Causal inference and counterfactual prediction in machine learning for actionable healthcare,Nature Machine Intelligence 2 \(2020\) 369 – 375\.
- Wu et al\. \[2008\]A\. W\. Wu, A\. K\. M\. Lipshutz, P\. J\. Pronovost,Effectiveness and efficiency of root cause analysis in medicine,Journal of the American Medical Association 299 \(2008\) 685–687\.
- Imbens \[2019\]G\. Imbens,Potential outcome and directed acyclic graph approaches to causality: Relevance for empirical practice in economics,NBER Working Paper Series \(2019\)\.
- Kleinberg and Hripcsak \[2011\]S\. Kleinberg, G\. Hripcsak,A review of causal inference for biomedical informatics,Journal of biomedical informatics 44 6 \(2011\) 1102–12\.
- Li et al\. \[2025\]J\. Li, B\. B\. Chu, I\. F\. Scheller, J\. Gagneur, M\. H\. Maathuis,Root cause discovery via permutations and cholesky decomposition,Journal of the Royal Statistical Society Series B: Statistical Methodology \(2025\)\.
- Glymour et al\. \[2019\]C\. Glymour, K\. Zhang, P\. Spirtes,Review of causal discovery methods based on graphical models,Frontiers in Genetics 10 \(2019\)\.
- e Oliveira et al\. \[2022\] E\. e Oliveira, V\. L\. Miguéis, J\. L\. Borges, Automatic root cause analysis in manufacturing: an overview & conceptualization, Journal of Intelligent Manufacturing 34 \(2022\) 2061–2078\.
- Papageorgiou et al\. \[2022\] K\. Papageorgiou, T\. Theodosiou, A\. Rapti, E\. I\. Papageorgiou, N\. Dimitriou, D\. Tzovaras, G\. Margetis, A systematic review on machine learning methods for root cause analysis towards zero\-defect manufacturing, Frontiers in Manufacturing Technology 2 \(2022\) 972712\.
- Solé et al\. \[2017\] M\. Solé, V\. Muntés\-Mulero, A\. I\. Rana, G\. Estrada, Survey on models and techniques for root\-cause analysis, [arXiv:1701\.08546](http://arxiv.org/abs/1701.08546) \(2017\)\.
- Soldani and Brogi \[2022\] J\. Soldani, A\. Brogi, Anomaly detection and failure root cause analysis in \(micro\) service\-based cloud applications: A survey, ACM Computing Surveys 55 \(2022\)\.
- Cochrane \[1972\] A\. L\. Cochrane, Effectiveness and efficiency: random reflections on health services \(1972\)\.
- Pearl \[2009\] J\. Pearl, Causality, 2 ed\., Cambridge University Press, 2009\.
- Cheng et al\. \[2022\] L\. Cheng, R\. Guo, R\. Moraffah, P\. Sheth, K\. S\. Candan, H\. Liu, Evaluation methods and measures for causal learning algorithms, IEEE Transactions on Artificial Intelligence 3 \(2022\) 924–943\.
- Gentzel et al\. \[2019\] A\. Gentzel, D\. Garant, D\. Jensen, The case for evaluating causal models using interventional measures and empirical data, in: H\. Wallach, H\. Larochelle, A\. Beygelzimer, F\. d'Alché\-Buc, E\. Fox, R\. Garnett \(Eds\.\), Advances in Neural Information Processing Systems, volume 32, Curran Associates, Inc\., 2019\.
- Orchard et al\. \[2025\] W\. R\. Orchard, N\. Okati, S\. H\. G\. Mejia, P\. Blöbaum, D\. Janzing, Root cause analysis of outliers with missing structural knowledge, in: The Annual Conference on Neural Information Processing Systems, 2025\.
- Saveliev et al\. \[2024\] E\. S\. Saveliev, T\. Schubert, T\. Pouplin, V\. Kosmoliaptsis, M\. van der Schaar, Climb: An AI\-enabled partner for clinical predictive modeling, ArXiv abs/2410\.03736 \(2024\)\.
- Vowels et al\. \[2021\] M\. J\. Vowels, N\. C\. Camgoz, R\. Bowden, D’ya like DAGs? A survey on structure learning and causal discovery, ACM Computing Surveys 55 \(2021\) 1–36\.
- Nogueira et al\. \[2022\] A\. R\. Nogueira, A\. Pugnana, S\. Ruggieri, D\. Pedreschi, J\. Gama, Methods and tools for causal discovery and causal inference, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 12 \(2022\)\.
- Zecevic et al\. \[2023\] M\. Zecevic, M\. Willig, D\. S\. Dhami, K\. Kersting, Causal parrots: Large language models may talk causality but are not causal, ArXiv abs/2308\.13067 \(2023\)\.
- Jin et al\. \[2023\] Z\. Jin, J\. Liu, Z\. Lyu, S\. Poff, M\. Sachan, R\. Mihalcea, M\. T\. Diab, B\. Schölkopf, Can large language models infer causation from correlation?, ArXiv abs/2306\.05836 \(2023\)\.
- Kıcıman et al\. \[2023\] E\. Kıcıman, R\. O\. Ness, A\. Sharma, C\. Tan, Causal reasoning and large language models: Opening a new frontier for causality, ArXiv abs/2305\.00050 \(2023\)\.
- Jiralerspong et al\. \[2024\] T\. Jiralerspong, X\. Chen, Y\. More, V\. Shah, Y\. Bengio, Efficient causal graph discovery using large language models, ArXiv abs/2402\.01207 \(2024\)\.
- Hasan and Gani \[2024\] U\. Hasan, M\. O\. Gani, Optimizing data\-driven causal discovery using knowledge\-guided search, 2024\. URL: [https://arxiv\.org/abs/2304\.05493](https://arxiv.org/abs/2304.05493)\. [arXiv:2304\.05493](http://arxiv.org/abs/2304.05493)\.
- Liu et al\. \[2024\] C\. Liu, Y\. Chen, T\. Liu, M\. Gong, J\. Cheng, B\. Han, K\. Zhang, Discovery of the hidden world with large language models, ArXiv abs/2402\.03941 \(2024\)\.
- Du et al\. \[2025\] H\. Du, Y\. Zheng, B\. Jing, Y\. Zhao, G\. Kou, G\. Liu, T\. Gu, W\. Li, C\. Yang, Causal discovery through synergizing large language model and data\-driven reasoning, in: Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V\.2, KDD ’25, Association for Computing Machinery, New York, NY, USA, 2025, pp\. 543–554\. URL: [https://doi\.org/10\.1145/3711896\.3736874](https://doi.org/10.1145/3711896.3736874)\. doi:[10\.1145/3711896\.3736874](https://doi.org/10.1145/3711896.3736874)\.
- Ban et al\. \[2025\] T\. Ban, L\. Chen, D\. Lyu, X\. Wang, Q\. Zhu, H\. Chen, LLM\-driven causal discovery via harmonized prior, IEEE Transactions on Knowledge and Data Engineering 37 \(2025\) 1943–1960\.
- Vashishtha et al\. \[2025\] A\. Vashishtha, A\. Kumar, A\. Pandey, A\. G\. Reddy, K\. Ahuja, V\. N\. Balasubramanian, A\. Sharma, Teaching transformers causal reasoning through axiomatic training, 2025\. URL: [https://arxiv\.org/abs/2407\.07612](https://arxiv.org/abs/2407.07612)\. [arXiv:2407\.07612](http://arxiv.org/abs/2407.07612)\.
- Antonucci et al\. \[2023\] A\. Antonucci, G\. Piqué, M\. Zaffalon, Zero\-shot causal graph extrapolation from text via LLMs, 2023\. URL: [https://arxiv\.org/abs/2312\.14670](https://arxiv.org/abs/2312.14670)\. [arXiv:2312\.14670](http://arxiv.org/abs/2312.14670)\.
- Saveliev et al\. \[2025\] E\. Saveliev, J\. Liu, N\. Seedat, A\. Boyd, M\. van der Schaar, Towards human\-guided, data\-centric LLM co\-pilots, 2025\. URL: [https://arxiv\.org/abs/2501\.10321](https://arxiv.org/abs/2501.10321)\. [arXiv:2501\.10321](http://arxiv.org/abs/2501.10321)\.
- Berrevoets et al\. \[2025\] J\. Berrevoets, J\. Piskorz, R\. Davis, H\. Amad, J\. Weatherall, M\. van der Schaar, Technical report: Facilitating the adoption of causal inference methods through LLM\-empowered co\-pilot, 2025\. URL: [https://arxiv\.org/abs/2508\.10581](https://arxiv.org/abs/2508.10581)\. [arXiv:2508\.10581](http://arxiv.org/abs/2508.10581)\.
- Wang et al\. \[2025\] X\. Wang, K\. Zhou, W\. Wu, H\. S\. Singh, F\. Nan, S\. Jin, A\. Philip, S\. Patnaik, H\. Zhu, S\. Singh, P\. P\. Prashant, Q\. Shen, B\. Huang, Causal\-copilot: An autonomous causal analysis agent, ArXiv abs/2504\.13263 \(2025\)\.
- Shan et al\. \[2025\] A\. Shan, J\. Kaur, R\. Singh, T\. Banka, R\. Yavatkar, T\. Sridhar, RCA copilot: Transforming network data into actionable insights via large language models, ICC 2025 \- IEEE International Conference on Communications \(2025\) 1566–1571\.
- Granger \[1969\] C\. W\. J\. Granger, Investigating causal relations by econometric models and cross\-spectral methods, 1969\.
- Shyalika et al\. \[2025a\] C\. Shyalika, A\. Sharma, F\. E\. Kalach, U\. Jaimini, C\. Henson, R\. F\. Harik, A\. P\. Sheth, CausalTrace: A neurosymbolic causal analysis agent for smart manufacturing, ArXiv abs/2510\.12033 \(2025a\)\.
- Shyalika et al\. \[2025b\] C\. Shyalika, R\. Prasad, A\. T\. A\. Ghazo, D\. Eswaramoorthi, S\. S\. Muthuselvam, A\. P\. Sheth, SmartPilot: Agent\-based copilot for intelligent manufacturing, in: Adaptive Agents and Multi\-Agent Systems, 2025b\.
- Verma et al\. \[2025\] V\. Verma, S\. Acharya, S\. Simko, D\. Bhardwaj, A\. Haghighat, M\. Sachan, D\. Janzing, B\. Schölkopf, Z\. Jin, Causal AI scientist: Facilitating causal data science with large language models, 2025\.
- Gamella et al\. \[2025\] J\. L\. Gamella, J\. Peters, P\. Bühlmann, Causal chambers as a real\-world physical testbed for AI methodology, Nature Machine Intelligence \(2025\)\.
- Tagliapietra et al\. \[2025\] N\. Tagliapietra, J\. Luettin, L\. Halilaj, M\. Willig, T\. Pychynski, K\. Kersting, CausalMan: A physics\-based simulator for large\-scale causality, [arXiv:2502\.12707](http://arxiv.org/abs/2502.12707) \(2025\)\.
- Hardt et al\. \[2024\] M\. Hardt, W\. R\. Orchard, P\. Blöbaum, S\. Kasiviswanathan, E\. Kirschbaum, The PetShop dataset – finding causes of performance issues across microservices, 2024\. [arXiv:2311\.04806](http://arxiv.org/abs/2311.04806)\.
- Ikram et al\. \[2022\] A\. Ikram, S\. Chakraborty, S\. Mitra, S\. K\. Saini, S\. Bagchi, M\. Kocaoglu, Root cause analysis of failures in microservices through causal discovery, in: Neural Inf\. Processing Systems, 2022\.
- Dawoud and Talupula \[2025\] A\. Dawoud, S\. Talupula, ProRCA: A causal Python package for actionable root cause analysis in real\-world business scenarios, [arXiv:2503\.01475](http://arxiv.org/abs/2503.01475) \(2025\)\.
- Erdös and Rényi \[1959\] P\. Erdös, A\. Rényi, On random graphs I, Publicationes Mathematicae Debrecen 6 \(1959\) 290\.
- Barabási and Albert \[1999\] A\.\-L\. Barabási, R\. Albert, Emergence of scaling in random networks, Science 286 \(1999\) 509–512\.
- Spirtes et al\. \[1995\] P\. Spirtes, C\. Meek, T\. S\. Richardson, Causal inference in the presence of latent variables and selection bias, in: Conference on Uncertainty in Artificial Intelligence, 1995\.
- Chickering \[2002\] D\. M\. Chickering, Optimal structure identification with greedy search, J\. Mach\. Learn\. Res\. 3 \(2002\) 507–554\.
- Nazaret and Blei \[2024\] A\. Nazaret, D\. Blei, Extremely greedy equivalence search, in: Proceedings of Conference on Uncertainty in Artificial Intelligence, 2024\.
- yin Lam et al\. \[2022\] W\. yin Lam, B\. Andrews, J\. Ramsey, Greedy relaxations of the sparsest permutation algorithm, ArXiv abs/2206\.05421 \(2022\)\.
- Zheng et al\. \[2018\] X\. Zheng, B\. Aragam, P\. K\. Ravikumar, E\. P\. Xing, DAGs with NO TEARS: Continuous optimization for structure learning, in: Advances in Neural Information Processing Systems, 2018\.
- Ng et al\. \[2021\] I\. Ng, A\. Ghassami, K\. Zhang, On the role of sparsity and DAG constraints for learning linear DAGs, 2021\. URL: [https://arxiv\.org/abs/2006\.10201](https://arxiv.org/abs/2006.10201)\. [arXiv:2006\.10201](http://arxiv.org/abs/2006.10201)\.
- Wang et al\. \[2021\] X\. Wang, Y\. Du, S\. Zhu, L\. Ke, Z\. Chen, J\. Hao, J\. Wang, Ordering\-based causal discovery with reinforcement learning, in: International Joint Conference on Artificial Intelligence, 2021\.
- Shimizu et al\. \[2006\] S\. Shimizu, P\. O\. Hoyer, A\. Hyvärinen, A\. Kerminen, A linear non\-Gaussian acyclic model for causal discovery, Journal of Machine Learning Research 7 \(2006\) 2003–2030\.
- Hoyer et al\. \[2008\] P\. O\. Hoyer, D\. Janzing, J\. M\. Mooij, J\. Peters, B\. Schölkopf, Nonlinear causal discovery with additive noise models, in: Neural Information Processing Systems, 2008\.
- Zhang and Hyvärinen \[2009\] K\. Zhang, A\. Hyvärinen, On the identifiability of the post\-nonlinear causal model, in: Conference on Uncertainty in Artificial Intelligence, 2009\.
- Tagliapietra et al\. \[2026\] N\. Tagliapietra, G\. L\. Marchioni, M\. Willig, J\. Luettin, L\. Halilaj, K\. Kersting, Causalsteward: An agentic divide\-conquer\-combine copilot for causal discovery, 2026\. URL: [https://openreview\.net/forum?id=3lFAyPa9Fe](https://openreview.net/forum?id=3lFAyPa9Fe)\.
- Runge et al\. \[2017\] J\. Runge, P\. J\. Nowack, M\. Kretschmer, S\. Flaxman, D\. Sejdinovic, Detecting and quantifying causal associations in large nonlinear time series datasets, Science Advances 5 \(2017\)\.
- Tank et al\. \[2021\] A\. Tank, I\. Covert, N\. J\. Foti, A\. Shojaie, E\. B\. Fox, Neural Granger causality, IEEE Transactions on Pattern Analysis and Machine Intelligence 44 \(2021\) 4267–4279\.
- Liu et al\. \[2021\] D\. Liu, C\. He, X\. Peng, F\. Lin, C\. Zhang, S\. Gong, Z\. Li, J\. Ou, Z\. Wu, MicroHECL: High\-efficient root cause localization in large\-scale microservice systems, in: IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2021, pp\. 338–347\.
- Li et al\. \[2022\] M\. Li, Z\. Li, K\. Yin, X\. Nie, W\. Zhang, K\. Sui, D\. Pei, Causal inference\-based root cause analysis for online service systems with intervention recognition, Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining \(2022\)\.
- Budhathoki et al\. \[2022\] K\. Budhathoki, L\. Minorics, P\. Blöbaum, D\. Janzing, Causal structure\-based root cause analysis of outliers, in: International Conference on Machine Learning, PMLR, 2022, pp\. 2357–2369\.