TopoEvo is a topology-aware self-evolving multi-agent framework for root cause analysis in microservices that couples graph representation learning with structured, topology-constrained reasoning. It achieves absolute improvements of up to 3.44% in root cause localization accuracy and boosts fault-type classification performance by 4.39% to 16.81% across diverse datasets.
STAR is a stage-attributed triage and repair framework that decomposes the workflows of LLM-based root cause analysis (RCA) agents into four structured stages. This decomposition enables stage-wise auditing, counterfactual evaluation, and patch-and-replay repair, with the aim of improving root cause localization and fault-type classification in microservice AIOps.
Anyscale published a technical guide on deploying production-ready AI agents using Ray Serve together with the MCP and A2A protocols. The article addresses common infrastructure bottlenecks by proposing a decoupled microservices architecture in which LLMs, tools, and agents each scale independently.