# Generalized Category Discovery in Federated Graph Learning
Source: [https://arxiv.org/html/2605.08178](https://arxiv.org/html/2605.08178)
**Authors:** Lianshuai Guo (goluis.cc@gmail.com), Xunkai Li (cs.xunkai.li@gmail.com), Wenyu Wang (hochi@sdu.edu.cn), Meixia Qu (mxqu@sdu.edu.cn)

**Affiliations:** Shandong University, School of Airspace Science and Engineering, Weihai 264209, China; Beijing Institute of Technology, School of Computer Science and Technology, Beijing 100081, China

**CRediT roles:** Conceptualization, Methodology, Software, Data Curation, Investigation, Formal analysis, Visualization, Supervision, Writing – original draft, Writing – review & editing
###### Abstract
Federated Graph Learning (FGL) enables collaborative learning over distributed graph data, yet existing approaches largely rely on a closed-world assumption, limiting their applicability in dynamic environments where novel categories continuously emerge. To bridge this gap, we target the practical scenario of Federated Graph Generalized Category Discovery (FGGCD), aiming to collaboratively discover novel categories across decentralized graph clients while retaining knowledge of known categories. We observe that FGGCD introduces two fundamental challenges: (1) the Neighborhood Absorption Effect, where structural fragmentation leads to biased neighborhood aggregation, causing novel nodes to be misclassified as known categories; and (2) Global Semantic Inconsistency, where the aforementioned local biases propagate to the server and are amplified by heterogeneous subgraph distributions, hindering cross-client knowledge integration. To address these issues, we propose GCD-FGL, an FGL framework for GCD that integrates a client-side Topology-Reliable Semantic Alignment and Discovery process to mitigate the neighborhood absorption effect, and a server-side Hierarchical Prototype Alignment strategy to resolve global semantic inconsistency. Extensive experiments on five real-world graph datasets demonstrate that GCD-FGL consistently outperforms state-of-the-art baselines, achieving an average absolute gain of +4.86 in HRScore.
###### Keywords:
Federated Graph Learning; Generalized Category Discovery; Open-World Learning; Non-IID Data
## 1 Introduction

Graph Neural Networks (GNNs) [jiang2019semi] have become a standard approach for modeling relational data, driving significant advancements across diverse domains including social networks [guo2020deep], recommender systems [yu2022graph; he2023simplifying], and healthcare [liu2022graphcdr; wang2023gadrp]. In many practical scenarios, however, graph data is naturally partitioned across multiple distributed data owners, a setting widely observed in applications such as financial and recommendation systems. Due to strict privacy regulations, these data silos cannot be directly centralized. Federated Graph Learning (FGL) [ZZWLWWLWZ22] offers a viable paradigm for collaborative graph model training under such privacy-preserving constraints.
Despite the remarkable success of FGL, existing paradigms largely rely on a closed-world assumption: the label space encountered during training is assumed to be identical to that during inference. However, in many real-world applications, models are deployed in highly dynamic, open-world scenarios where novel categories continually emerge over time. This fundamental gap between closed-world training and open-world realities restricts the adaptability of existing FGL models to unseen categories in the wild.
Figure 1: The neighborhood absorption effect under a global graph versus isolated subgraphs on Cora. Isolated nodes belonging to novel categories (y-axis) exhibit severe overconfident misclassification towards known categories (x-axis).

To take a step towards bridging this gap, Generalized Category Discovery (GCD) [vaze2022generalized] has been proposed as a critical component of open-world learning. Given an unlabeled dataset drawn from a mixed label space $\mathcal{Y}_{known}\cup\mathcal{Y}_{novel}$, GCD aims to simultaneously classify known categories and discover novel ones, leveraging knowledge from a labeled dataset defined solely on $\mathcal{Y}_{known}$. Unlike Out-of-Distribution (OOD) detection [hendrycks2017baseline], which simply rejects unfamiliar samples, or Novel Class Discovery (NCD) [han2019learning], which assumes all unlabeled data belongs to $\mathcal{Y}_{novel}$, GCD operates under a much more realistic and challenging open-world setting. It requires the model to precisely distinguish between known and novel categories while concurrently clustering the unknowns into meaningful semantic categories.
Recently, efforts have been made to extend GCD to complex data structures. For instance, Graph Generalized Category Discovery (GGCD) [deng2025towards] has emerged to tackle category discovery on graph data, assuming a holistic graph, while Federated GCD [pu2024federated] has been explored for federated learning, but only on image data. However, the intersection of GCD and FGL remains largely unexplored. Federated Graph Generalized Category Discovery (FGGCD) aims to collaboratively discover novel categories across distributed graph silos. Yet simply applying existing GGCD methods in a federated environment fails to work effectively. Specifically, extending GGCD to federated settings faces two primary challenges:
Challenge 1: Structural Truncation and the Neighborhood Absorption Effect. Applying GCD to federated environments introduces a fundamental conflict: GCD requires a holistic data distribution to discover novel clusters, while federated settings physically isolate the graph [zhang2023federated]. For a novel-category node $v$ ($y_v\in\mathcal{Y}_{novel}$), this structural truncation deprives it of sufficient intra-class support. Consequently, its local observation is frequently dominated by known-category neighbors. During the GNN message-passing process, the learned representation $h_v$ becomes biased toward these known-category signals. This representation collapse, characterized by the neighborhood absorption effect, leads to structural semantic dilution: $P(h_v\mid y_v\in\mathcal{Y}_{novel})\to P(h_v\mid y_v\in\mathcal{Y}_{known})$. To empirically validate this, we conduct a motivating study on the Cora dataset (see Figure [1](https://arxiv.org/html/2605.08178#S1.F1)) comparing representations trained on an intact global graph versus isolated subgraphs. We observe that structural truncation induces an increase in the rate at which novel nodes are absorbed and misclassified into known categories. This absorption degrades the structural semantics essential for GCD exploration, posing an obstacle to reliable novel cluster discovery. To tackle this, we design a client-side Topology-Aware Graph Contrastive Flow module, as further detailed in Section [3.2](https://arxiv.org/html/2605.08178#S3.SS2).
Challenge 2: Global Semantic Inconsistency Propagated by Local Collapse. The local representation collapse outlined above propagates to the server, driving and amplifying semantic inconsistency at the global level. Federated graphs inherently exhibit non-IID (non-independent and identically distributed) characteristics [li2024fedgta; zhu2024fedtad] driven by the homophily assumption, where nodes sharing similar labels concentrate within specific clients. When local representations of novel nodes are distorted by their specific, skewed local known-category neighbors (Challenge 1), the resulting local semantic spaces become largely client-specific. Because novel categories lack explicit supervision and are dynamically explored on local clients, this interplay between local representation offset and graph-induced label skew means that novel clusters formed by different clients often possess divergent semantic meanings. Consequently, accurately matching and aligning these isolated novel clusters at the server side becomes challenging. This challenge is exacerbated in FGGCD scenarios where the total number of global categories is unknown. Without a mechanism to reconcile these misaligned local semantic spaces, the framework is prone to global semantic inconsistency. Within our GCD-FGL framework, we resolve this misalignment through a server-side Hierarchical Prototype Alignment strategy, as detailed in Section [3.3](https://arxiv.org/html/2605.08178#S3.SS3).
To address these challenges, we propose GCD-FGL, a federated framework tailored for FGGCD. To tackle Challenge 1 on the client side, we introduce a Topology-Reliable Semantic Alignment and Discovery process guided by the Topology Reliability Guidance (TRG) mechanism. This process calibrates isolated node representations, mitigating the local representation bias and the neighborhood absorption effect caused by structurally truncated subgraphs. To address Challenge 2 on the server side, we introduce a Hierarchical Prototype Alignment strategy that harmonizes the global semantic space, resolving the global semantic inconsistency propagated by local representation collapse across highly heterogeneous label spaces.
The main contributions of this work are as follows:
- **New Observation.** To the best of our knowledge, this is the first work to formalize the challenging problem of FGGCD, bridging the gap between GCD and FGL.
- **New Method.** We propose GCD-FGL, a novel framework that integrates a client-side Topology-Reliable Semantic Alignment and Discovery process and a server-side Hierarchical Prototype Alignment strategy, tailored specifically for FGGCD scenarios.
- **SOTA Performance.** We conduct extensive experiments on five real-world graph benchmark datasets. The results show that GCD-FGL consistently outperforms state-of-the-art baselines, achieving an average gain of +4.86 in HRScore.
## 2 Preliminaries and Related Work

### 2.1 Preliminaries and Problem Formulation
Table [1](https://arxiv.org/html/2605.08178#S2.T1) summarizes the main notations used in this paper. We consider a global graph $\mathcal{G}$ partitioned into $N$ isolated local subgraphs $\{\mathcal{G}_1,\dots,\mathcal{G}_N\}$ across distributed clients. The global label space $\mathcal{Y}=\mathcal{Y}_{known}\cup\mathcal{Y}_{novel}$ comprises disjoint known and novel categories.
For each client $i$, its local graph $\mathcal{G}_i$ consists of a labeled set $\mathcal{V}_i^L$ (where $y_v\in\mathcal{Y}_{known}$) and an unlabeled set $\mathcal{V}_i^U$ (where $y_v\in\mathcal{Y}_{known}\cup\mathcal{Y}_{novel}$). Specifically, each client contains both labeled known categories and unlabeled samples. Due to data heterogeneity, novel categories are not necessarily shared across clients and may be highly heterogeneous. Under privacy-protected constraints, cross-client edges are unobservable, meaning the union of local edges is a strict subset of the global edges ($\bigcup_{i=1}^{N}\mathcal{E}_i\subset\mathcal{E}$).
The primary goal of FGGCD is to collaboratively learn global parameters $\omega$ that minimize the empirical risk across clients without sharing local data. Ultimately, it seeks a unified representation space for both known-category classification and dynamic novel-category discovery, a balance quantitatively evaluated by the HRScore. The global objective is formulated as:
$$\min_{\omega}\sum_{i=1}^{N}\frac{|\mathcal{V}_i|}{|\mathcal{V}|}\mathcal{L}_{total}(\omega;\mathcal{V}_i^L,\mathcal{G}_i)+\mathcal{L}_{global}(\omega), \tag{1}$$
where $\mathcal{L}_{total}(\omega;\mathcal{V}_i^L,\mathcal{G}_i)$ denotes the local optimization objective evaluated on client $i$ over its local subgraph $\mathcal{G}_i$, guided by the available labeled set $\mathcal{V}_i^L$. To achieve a globally consistent semantic space despite data isolation, $\mathcal{L}_{global}(\omega)$ acts as a conceptual global alignment regularization term. This abstract formulation establishes the theoretical goal of FGGCD; the specific realizations of the local discovery objectives and the global alignment mechanisms are detailed in Section [3.2](https://arxiv.org/html/2605.08178#S3.SS2) and Section [3.3](https://arxiv.org/html/2605.08178#S3.SS3).
Table 1: Summary of main notations used in GCD-FGL.

| Notation | Description |
| --- | --- |
| **Graph & Data Space** | |
| $\mathcal{G}$, $\mathcal{G}_i$ | Global graph / local subgraph on client $i$ |
| $\mathcal{V}$, $\mathcal{E}$ | Global node set / global edge set |
| $\mathcal{V}_i^L$, $\mathcal{V}_i^U$ | Labeled set / unlabeled set on client $i$ |
| $\mathcal{N}_i(v)$, $d_v$ | Local neighborhood / local degree of node $v$ |
| $\mathcal{Y}_{known}$, $\mathcal{Y}_{novel}$ | Set of known categories / set of novel categories |
| **Federated Setting & Reliability** | |
| $N$, $\omega$, $\omega_i$ | Total clients / global model / local model |
| $S_{conf}^{(v)}$, $S_{homo}^{(v)}$ | Predictive confidence / structural smoothness of $v$ |
| $w_v$, $w_i$ | TPR score of node $v$ / average TPR of client $i$ |
| $\mathbf{c}_i=\{c_{i,k}\}$ | Set of local sample densities on client $i$ |
| $M_v$ | Boolean mask for confident pseudo-labels |
| **Server & Prototype Management** | |
| $\mathcal{P}_{global}$ | Global prototype buffer at the server |
| $P_k^{(t)}$ | Global prototype for category $k$ at round $t$ |
| $\mathcal{P}_i^{known}$, $\mathcal{P}_i^{novel}$ | Local known prototypes / local novel prototypes |
| $K_{local}$ | Relaxed local cluster count for prototype extraction |
| $v_{i,k}$ | Joint density-reliability weight |
| $\mathcal{X}_{nov}$ | Global discovery pool of unassigned sub-prototypes |
| $\mathcal{P}_{cand}$, $\mathcal{P}_{hist}$ | Global candidate centers / historical novel prototypes |
| **Objectives & Hyperparameters** | |
| $\mathcal{L}_{sup}$, $\mathcal{L}_{unsup}$, $\mathcal{L}_{align}$ | Supervised flow / TPR-guided unsupervised flow / dual-flow semantic alignment loss ($\mathcal{L}_{sup}+\mathcal{L}_{unsup}$) |
| $\mathcal{L}_{gcl}$ | Topology-aware graph contrastive flow loss |
| $\tau$, $\tau_{sharp}$ | Dynamic temperature / sharpened temperature |
| $\theta^*$, $\lambda_{hc}$ | Optimal threshold cut / hierarchical penalty |
| $\tau_{base}$, $\tau_{density}$ | Base matching threshold / density filter threshold |
| $A_{r,c}$ | Boolean assignment matrix for prototype matching |
| $\rho$ | EMA momentum update decay factor |
### 2.2 Generalized Category Discovery

The ability to identify both known and novel categories in unlabeled data is crucial for open-world learning. Early paradigms addressing unseen categories primarily focused on OOD detection [hendrycks2017baseline], which merely rejects unfamiliar samples without further categorizing them, or NCD [han2019learning], which operates under the strict assumption that all unlabeled data belongs to the set of novel categories. To relax the strict NCD assumptions, AutoNovel [han2020autonovel] proposed transferring knowledge from known to novel categories through self-supervised representation learning, but it still struggled with realistic scenarios where unlabeled data contain a mixture of known and unknown categories. To address this mixed-data challenge, ORCA [cao2022open] introduced an open-world semi-supervised learning framework using uncertainty modeling, but its complex multi-stage objective often led to suboptimal feature alignment. Subsequently, the pioneering work GCD [vaze2022generalized] formally defined the GCD problem and established a strong baseline by using contrastive pre-training followed by non-parametric $K$-means clustering; however, this two-stage pipeline fundamentally decoupled representation learning from cluster assignment. To overcome this disconnect, SimGCD [wen2023parametric] introduced a parametric framework that jointly optimizes representations and cluster prototypes end-to-end, significantly improving the accuracy of novel-category discovery on image data.
While GCD has been extended to federated settings via works like FedGCD [pu2024federated], these advances are strictly confined to independent image data. Existing methods assume either globally intact graphs or independent samples, and these assumptions collapse in FGL, where geographically distributed yet interconnected data suffers severe structural truncation across client silos. Consequently, discovering novel categories on decentralized graphs with heterogeneous label spaces and severed structural dependencies remains a formidable challenge.
### 2.3 Federated Graph Learning

To collaboratively train GNNs [kipf2016semi] over distributed subgraphs without centralizing raw data, FedAvg [mcmahan2017communication] serves as the standard optimization strategy. Specifically, the global aggregation at communication round $t$ is computed as:
$$\tilde{\omega}^t=\sum_{i=1}^{N}\frac{|\mathcal{V}_i|}{|\mathcal{V}|}\omega_i^{t-1}, \tag{2}$$
where $\tilde{\omega}^t$ is the global model, and $\omega_i^{t-1}$ is the local model updated by client $i$ in the previous round. Upon receiving $\tilde{\omega}^t$, client $i$ performs local updates via gradient descent:
$$\omega_i^t=\tilde{\omega}^t-\eta\nabla f(\tilde{\omega}^t,\mathcal{G}_i), \tag{3}$$
where $\eta$ is the learning rate and $\nabla f(\cdot)$ denotes the gradient of the total loss over the isolated subgraph $\mathcal{G}_i$.
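To make the update rule concrete, the following is a minimal PyTorch sketch of Eqs. (2)-(3); the function names and the plain gradient-descent local step are illustrative assumptions, not the exact implementation used in this paper.

```python
import copy
import torch

def fedavg_aggregate(local_states, num_nodes):
    """Node-count-weighted FedAvg aggregation (Eq. 2)."""
    total = float(sum(num_nodes))
    agg = copy.deepcopy(local_states[0])
    for key in agg:
        agg[key] = sum((n / total) * s[key].float()
                       for s, n in zip(local_states, num_nodes))
    return agg

def local_update(model, global_state, loss_fn, graph, lr=1e-3):
    """One local gradient-descent step from the broadcast global model (Eq. 3)."""
    model.load_state_dict(global_state)
    loss = loss_fn(model, graph)  # total loss on the isolated subgraph G_i
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad
    model.zero_grad()
    return model.state_dict()
```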
While FedAvg supports isolated training, it performs poorly on non-IID client data and struggles to capture the unique cross-client dependencies of graph-structured data. To address this, FedSage+ [zhang2021subgraph] trained a missing-edge generator to reconstruct cross-subgraph links, mitigating structural truncation, but it focuses mainly on structural completion. Subsequently, FedProto [tan2022fedproto] introduced prototype sharing to alleviate label distribution heterogeneity without transmitting model gradients, yet it largely overlooks topological structures. To simultaneously capture structure and heterogeneity, FedPub [baek2023personalized] proposed functional embeddings for personalized subgraph training, and AdaFGL [li2024adafgl] along with FedGTA [li2024fedgta] integrated topology-confidence strategies to improve structure-aware aggregation. More recently, FedTAD [zhu2024fedtad] employed topology-aware, data-free distillation to achieve structure-sensitive aggregation, thereby reducing reliance on raw feature sharing.
Meanwhile, standard FGL paradigms enforce a strict closed-world assumption, rendering them incapable of discovering novel categories that dynamically emerge across isolated clients. Integrating open-world discovery into FGL is therefore essential for robust deployment in real-world environments. Our proposed GCD-FGL framework is designed to bridge this exact gap.
## 3 Methodology

Figure 2: Overview of the proposed GCD-FGL framework.

### 3.1 Overview of the GCD-FGL Framework
The core principle of GCD-FGL is to treat global prototypes [snell2017prototypical] as the primary semantic bridge across clients, progressively refining these representations through reliability-aware local extraction and hierarchical global alignment. As illustrated in Figure [2](https://arxiv.org/html/2605.08178#S3.F2), unlike standard FGL frameworks [mcmahan2017communication] that are sensitive to subgraph distribution discrepancy and heterogeneous label spaces, GCD-FGL is driven by two systematically designed structural components:
**Topology-Reliable Semantic Alignment and Discovery (Client Side):** A comprehensive alignment and discovery process guided by the TRG mechanism. By filtering noisy edges, it directly mitigates the neighborhood absorption effect and prevents representation collapse for novel categories.
**Hierarchical Prototype Alignment (Server Side):** To resolve global semantic inconsistency across non-IID clients, the server performs Hierarchical Prototype Alignment to harmonize the global semantic space. By incorporating a Density-TPR weighting mechanism into the aggregation process, it unifies decentralized knowledge and dynamically identifies novel categories without predefining the cluster count ($K$), effectively preventing amplified catastrophic forgetting.
**Collaborative Optimization Protocol.** The complete GCD-FGL training paradigm iteratively executes local topology-aware extraction and global hierarchical alignment. The client-side local update procedure is detailed in Algorithm [1](https://arxiv.org/html/2605.08178#alg1), while the overarching server-side aggregation and category discovery workflow is summarized in Algorithm [2](https://arxiv.org/html/2605.08178#alg2).
**Algorithm 1: GCD-FGL-Client**

**Input:** client ID $i$, global prototypes $\mathcal{P}_{global}$, global model $\omega^{(t-1)}$, local epochs $E$

1. Initialize the local model $\omega_i\leftarrow\omega^{(t-1)}$.
2. For $e=1,\dots,E$:
   1. Compute node embeddings $\tilde{Z}$ via the local GNN.
   2. **TPR Estimation:** compute the TPR score $w_v$ for each $v\in\mathcal{V}_i$ via Eq. [6](https://arxiv.org/html/2605.08178#S3.E6).
   3. **Reliability-Aware Semantic Alignment:** update $\omega_i$ by optimizing $\mathcal{L}_{total}=\mathcal{L}_{align}+\beta\mathcal{L}_{gcl}$ via Eq. [12](https://arxiv.org/html/2605.08178#S3.E12).
3. **Local Category Discovery & Extraction:** extract local prototypes $\mathcal{P}_i^{known}$ and $\mathcal{P}_i^{novel}$ via clustering.
4. Compute the average TPR $w_i$ and the set of local sample densities $\mathbf{c}_i=\{c_{i,k}\}$.
5. **Return** $\{\omega_i,\mathcal{P}_i,w_i,\mathbf{c}_i\}$, where $\mathcal{P}_i=\mathcal{P}_i^{known}\cup\mathcal{P}_i^{novel}$.
**Algorithm 2: GCD-FGL-Server**

**Input:** rounds $R$, momentum $\rho$, hierarchical penalty $\lambda_{hc}$

1. For each round $t=1,\dots,R$:
   1. Sample a subset of clients $S_t$.
   2. For each client $i\in S_t$ in parallel: $\{\omega_i,\mathcal{P}_i,w_i,\mathbf{c}_i\}\leftarrow\text{Client}(i,\mathcal{P}_{global},\omega^{(t-1)})$.
   3. **Step 1 (Density-Aware Prototype Aggregation):** update the global model via Eq. [2](https://arxiv.org/html/2605.08178#S2.E2); compute the updated global prototypes $P_k^{(t)}$ via Eq. [14](https://arxiv.org/html/2605.08178#S3.E14).
   4. **Step 2 (Hierarchical Prototype Alignment):** decompose each $\mathcal{P}_i$ into $\mathcal{P}_i^{known}$ and $\mathcal{P}_i^{novel}$ based on the global known prototypes; form the global discovery pool $\mathcal{X}_{nov}=\bigcup_{i\in S_t}\mathcal{P}_i^{novel}$; extract candidate novel centers $\mathcal{P}_{cand}$ via Eq. [15](https://arxiv.org/html/2605.08178#S3.E15).
   5. **Steps 3 & 4 (EMA Update & Global Memory Routing):** update the memory $\mathcal{P}_{global}$ with $\mathcal{P}_{cand}$ via Eq. [17](https://arxiv.org/html/2605.08178#S3.E17).
   6. Broadcast $\omega^{(t)}$ and $\mathcal{P}_{global}$ to the clients $S_t$.
### 3.2 Client Side: Topology-Reliable Semantic Alignment and Discovery

**Motivation.** Structural truncation in isolated subgraphs exacerbates the neighborhood absorption effect, thereby inducing representation collapse for novel categories. **Solution.** We propose a Topology-Reliable Semantic Alignment and Discovery process, which utilizes the Topology Reliability (TPR) metric to evaluate node trustworthiness and filter heterophilous noise.
**TPR Estimation.** For a given node $v\in\mathcal{V}_i$ on client $i$, its predictive confidence $S_{conf}^{(v)}$ (distinguishing between states of high and low confidence) is derived from the normalized Shannon entropy over the predicted probability distribution $p_{v,c}$, a standard proxy for model uncertainty [grandvalet2005semi]:
$$S_{conf}^{(v)}=1-\frac{-\sum_{c=1}^{C}p_{v,c}\log(p_{v,c}+\epsilon)}{\max(\log C,\epsilon)}. \tag{4}$$
Concurrently, the structural smoothness $S_{homo}^{(v)}$ measures the semantic consistency between node $v$ and its immediate local neighborhood $\mathcal{N}_i(v)$:
$$S_{homo}^{(v)}=\frac{1}{\max(1,d_v)}\sum_{u\in\mathcal{N}_i(v)}\max(0,\tilde{z}_v^\top\tilde{z}_u), \tag{5}$$
where $d_v$ is the local degree. The $\max(0,\cdot)$ operation explicitly filters negative heterophilous edges.
**Intuition.** The final TPR score is designed as a strict conjunction for filtering heterophilous noise, ensuring that a node influences subsequent alignments only if it simultaneously exhibits high predictive certainty ($S_{conf}$) and local semantic homophily ($S_{homo}$):
$$w_v=S_{homo}^{(v)}\cdot S_{conf}^{(v)}. \tag{6}$$
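A minimal PyTorch sketch of Eqs. (4)-(6) follows; the tensor shapes and the COO `edge_index` convention are assumptions, and undirected edges are assumed to be stored in both directions.

```python
import torch
import torch.nn.functional as F

def tpr_scores(logits, z, edge_index, eps=1e-8):
    """Topology Reliability (TPR) scores w_v (Eqs. 4-6).
    logits: [N, C] class logits; z: [N, d] embeddings; edge_index: [2, E]."""
    p = F.softmax(logits, dim=1)
    num_classes = p.size(1)
    entropy = -(p * torch.log(p + eps)).sum(dim=1)
    log_c = torch.log(torch.tensor(float(num_classes)))
    s_conf = 1.0 - entropy / torch.clamp(log_c, min=eps)  # Eq. (4)

    z = F.normalize(z, dim=1)  # cosine similarity reduces to a dot product
    src, dst = edge_index
    sim = (z[src] * z[dst]).sum(dim=1).clamp(min=0)  # max(0, .) drops heterophilous edges
    deg = torch.zeros(z.size(0)).index_add_(0, src, torch.ones_like(sim))
    s_homo = torch.zeros(z.size(0)).index_add_(0, src, sim) / deg.clamp(min=1)  # Eq. (5)

    return s_homo * s_conf  # Eq. (6): multiplicative conjunction
```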
**Topology-Aware Graph Contrastive Flow.** To capture intrinsic graph topology and mitigate representation collapse in structurally truncated regions [chuang2020debiased], we employ a semantic graph contrastive loss ($\mathcal{L}_{gcl}$) [zhu2020deep]. Specifically, let $M_v\in\{0,1\}$ denote a boolean mask indicating a confident assignment ($M_v\equiv 1$ for labeled nodes), and let $\tilde{y}_v$ denote the semantic label, which is the ground truth $y_v$ for labeled nodes or the assigned pseudo-label $\hat{y}_v$ for unlabeled nodes. The adjusted positive similarity $s_{pos}(v,u)$ between connected nodes $(v,u)\in\mathcal{E}_i$ is formulated as:
$$s_{pos}(v,u)=\begin{cases}10^{-8}, & \text{if } M_v\land M_u \text{ and } \tilde{y}_v\neq\tilde{y}_u,\\ \exp\!\left(\frac{\tilde{z}_v^\top\tilde{z}_u}{\tau}\right), & \text{otherwise},\end{cases} \tag{7}$$
where setting the similarity to $10^{-8}$ avoids numerical instability in subsequent logarithmic calculations. The overall contrastive loss, which is symmetric for negative samples, is defined as follows.
**Intuition.** Unlike conventional contrastive objectives that uniformly pull all connected nodes together, this topology-aware loss employs the false-positive truncation mechanism ($s_{pos}$) to nullify the pull of confident heterophilous edges:
$$\mathcal{L}_{gcl}=\frac{1}{|\mathcal{V}_i|}\sum_{v\in\mathcal{V}_i}\frac{1}{d_v}\sum_{u\in\mathcal{N}_i(v)}-\log\!\left(\frac{s_{pos}(v,u)}{s_{pos}(v,u)+\sum_{u'}s_{neg}(v,u')}\right), \tag{8}$$
where $s_{neg}(v,u')=\exp\!\left(\frac{\tilde{z}_v^\top\tilde{z}_{u'}}{\tau}\right)$ for negative samples $u'\notin\mathcal{N}_i(v)$, following the standard InfoNCE formulation.
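For a small subgraph, the contrastive flow of Eqs. (7)-(8) can be sketched densely as below; the dense $[N,N]$ similarity matrix is a simplification for readability, and treating all non-neighbors as negatives follows the InfoNCE convention stated above.

```python
import torch
import torch.nn.functional as F

def gcl_loss(z, edge_index, mask, y_tilde, tau=0.1, eps=1e-8):
    """Topology-aware contrastive flow (Eqs. 7-8), dense sketch.
    mask: [N] bool confident-assignment mask M_v; y_tilde: [N] semantic labels."""
    n = z.size(0)
    z = F.normalize(z, dim=1)
    sim = torch.exp(z @ z.t() / tau)  # pairwise exp-similarities
    adj = torch.zeros(n, n, dtype=torch.bool)
    adj[edge_index[0], edge_index[1]] = True

    # False-positive truncation: confident pairs with conflicting semantic labels
    conflict = (mask.unsqueeze(0) & mask.unsqueeze(1)
                & (y_tilde.unsqueeze(0) != y_tilde.unsqueeze(1)))
    s_pos = torch.where(conflict, torch.full_like(sim, 1e-8), sim)  # Eq. (7)

    self_mask = torch.eye(n, dtype=torch.bool)
    neg_sum = (sim * (~adj & ~self_mask)).sum(1, keepdim=True)  # negatives u' not in N_i(v)
    per_edge = -torch.log(s_pos / (s_pos + neg_sum + eps))
    deg = adj.sum(1).clamp(min=1)
    return ((per_edge * adj).sum(1) / deg).mean()  # Eq. (8)
```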
**Supervised Flow.** For nodes with ground-truth labels $\mathcal{V}_i^L$, the supervised flow employs a cross-entropy objective to align local embeddings against the global prototype buffer $\mathcal{P}_{global}$ broadcast by the server:
$$\mathcal{L}_{sup}=\frac{1}{|\mathcal{V}_i^L|}\sum_{v\in\mathcal{V}_i^L}\text{CE}\!\left(\frac{\tilde{z}_v\mathcal{P}_{global}^\top}{\tau},y_v\right), \tag{9}$$
where $\text{CE}(\cdot,\cdot)$ denotes the cross-entropy loss applied to the softmax-normalized logits.
**TPR-Guided Unsupervised Flow.** For unlabeled nodes $\mathcal{V}_i^U$, we generate pseudo-labels $\hat{y}_v$ using a sharpened temperature $\tau_{sharp}$ over the currently available global prototypes in $\mathcal{P}_{global}$. To filter out noisy assignments, a dynamic threshold $\gamma$ is calculated based on batch confidence statistics:
$$\gamma=\max(0.5,\ \mu_q+\alpha\cdot\sigma_q), \tag{10}$$
where $\mu_q$ and $\sigma_q$ are the mean and standard deviation of prediction confidences within the current batch. Unlike fixed thresholds that may discard too many samples early in training [sohn2020fixmatch], this batch-adaptive formulation dynamically scales with the model's learning pace. The boolean mask is formally defined as $M_v=\mathbb{I}(q_v>\gamma)$ to retain only reliable predictions.
**Intuition.** To robustly align with the global prototypes, this loss employs two safety mechanisms: the adaptive mask ($M_v$) maintains a steady learning curriculum, while the TPR score ($w_v$) acts as a soft weight to suppress gradients from surviving heterophilous noise:
$$\mathcal{L}_{unsup}=\frac{1}{|\mathcal{V}_i^U|}\sum_{v\in\mathcal{V}_i^U}\left[w_v\cdot M_v\cdot\text{CE}\!\left(\frac{\tilde{z}_v\mathcal{P}_{global}^\top}{\tau},\hat{y}_v\right)\right]. \tag{11}$$
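A compact PyTorch sketch of the dual alignment flows (Eqs. 9-11), under the assumption that pseudo-labels and confidences are read off the sharpened prototype softmax:

```python
import torch
import torch.nn.functional as F

def alignment_losses(z, protos, labels, labeled_mask, w,
                     tau=0.1, tau_sharp=0.05, alpha=1.0):
    """Supervised (Eq. 9) and TPR-guided unsupervised (Eqs. 10-11) flows.
    protos: [K, d] global prototype buffer; w: [N] TPR scores."""
    z = F.normalize(z, dim=1)
    protos = F.normalize(protos, dim=1)
    logits = z @ protos.t() / tau

    l_sup = F.cross_entropy(logits[labeled_mask], labels[labeled_mask])  # Eq. (9)

    unl = ~labeled_mask
    q = F.softmax(z[unl] @ protos.t() / tau_sharp, dim=1)  # sharpened pseudo-labels
    conf, y_hat = q.max(dim=1)
    gamma = max(0.5, (conf.mean() + alpha * conf.std()).item())  # Eq. (10)
    keep = (conf > gamma).float()  # adaptive mask M_v
    ce = F.cross_entropy(logits[unl], y_hat, reduction="none")
    l_unsup = (w[unl] * keep * ce).mean()  # Eq. (11)
    return l_sup, l_unsup
```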
**Total Loss.** The total loss optimized iteratively by client $i$ balances supervised discrimination, unsupervised alignment, and structural consistency:
$$\mathcal{L}_{total}=\underbrace{\mathcal{L}_{sup}+\mathcal{L}_{unsup}}_{\mathcal{L}_{align}}+\beta\mathcal{L}_{gcl}. \tag{12}$$
Upon convergence, the client extracts known ($\mathcal{P}_i^{known}$) and novel ($\mathcal{P}_i^{novel}$) prototypes and their sample densities via local $K$-means on the unlabeled data, sending them to the server.
**Local Category Discovery.** In the GCD setting, the true number of novel categories is unknown. Instead of a predefined $K$, we employ relaxed over-clustering and density filtering. For the local $K$-means, the relaxed local cluster count $K_{local}$ is dynamically upper-bounded by the unlabeled graph volume to prevent over-fragmentation:
$$K_{local}=\max\left(2,\ \min\left(K_{max},\ \left\lfloor\frac{|\mathcal{V}_i^U|}{3}\right\rfloor\right)\right), \tag{13}$$
where $K_{max}$ is an empirical upper bound for the expected number of novel categories. Equation [13](https://arxiv.org/html/2605.08178#S3.E13) serves as a relaxed upper bound to encourage exploration. Consequently, in edge cases such as a very small $|\mathcal{V}_i^U|$, or when unlabeled nodes exclusively belong to known categories, this relaxation might propose spurious boundaries. However, the subsequent density filter removes these false novel prototypes by accepting a local cluster $C_k$ only if it satisfies a minimum sample density constraint $|C_k|>\tau_{density}$ with $\tau_{density}=5$. Finally, server-side hierarchical clustering structurally corrects any surviving spurious prototypes to mitigate false category generation.
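The local discovery step can be sketched with scikit-learn as follows; `k_max` and the density threshold mirror $K_{max}$ and $\tau_{density}=5$, while the function name is a hypothetical choice for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def discover_local_prototypes(z_unlabeled, k_max=10, tau_density=5):
    """Relaxed over-clustering (Eq. 13) followed by the density filter.
    z_unlabeled: [m, d] embeddings of the local unlabeled set V_i^U."""
    m = len(z_unlabeled)
    k_local = max(2, min(k_max, m // 3))  # Eq. (13)
    km = KMeans(n_clusters=k_local, n_init=10).fit(z_unlabeled)
    protos, densities = [], []
    for k in range(k_local):
        members = np.flatnonzero(km.labels_ == k)
        if len(members) > tau_density:  # reject sparse, likely spurious clusters
            protos.append(km.cluster_centers_[k])
            densities.append(len(members))
    return np.array(protos), np.array(densities)
```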
### 3.3 Server Side: Hierarchical Category Discovery

**Motivation.** The local representation collapse often propagates to the server, interacting with label skew to drive and amplify global semantic inconsistency across clients. **Solution.** We design a Hierarchical Prototype Alignment strategy to identify novel category candidates and maintain memory stability.

The server performs a progressive semantic association process, in which reliable local discoveries are first aggregated, then structurally grouped, and finally stabilized through temporal memory.
**Step 1: Density-Aware Aggregation.** To mitigate label skew, the server computes the updated global prototype $P_k^{(t)}$ using a joint weight $v_{i,k}=w_{i,k}\cdot c_{i,k}$, where $w_{i,k}$ represents the cluster-specific average TPR score for local cluster $k$ and $c_{i,k}\in\mathbf{c}_i$ represents the local sample density for category $k$, within the central aggregator:
$$P_k^{(t)}=\text{Normalize}\!\left(\frac{\sum_i v_{i,k}P_{i,k}}{\sum_i v_{i,k}+\epsilon}\right), \tag{14}$$
where $P_{i,k}\in\mathcal{P}_i$ denotes the local prototype for category $k$ received from client $i$. This joint weighting mitigates weight-shifting from non-IID distributions, preventing noisy clients from dominating the global space.
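In code, the aggregation of Eq. (14) for a single category $k$ reduces to a weighted average followed by renormalization; a NumPy sketch with assumed array layouts:

```python
import numpy as np

def aggregate_prototypes(local_protos, tpr, density, eps=1e-8):
    """Density-TPR weighted prototype aggregation (Eq. 14).
    local_protos: [M, d] prototypes for one category from M clients;
    tpr, density: [M] cluster-level average TPR scores and sample counts."""
    v = tpr * density  # joint weight v_{i,k}
    p = (v[:, None] * local_protos).sum(axis=0) / (v.sum() + eps)
    return p / (np.linalg.norm(p) + eps)  # Normalize(.)
```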
**Step 2: Hierarchical Clustering for Server-Side Category Discovery.** Local known-category prototypes $\mathcal{P}_i^{known}$ are absorbed during the Step 1 aggregation. Therefore, only the unabsorbed local novel prototypes $\mathcal{P}_i^{novel}$ are injected into the global discovery pool $\mathcal{X}_{nov}=\bigcup_{i\in S_t}\mathcal{P}_i^{novel}$. To extract global candidate centers $\mathcal{P}_{cand}$ for novel categories, we introduce a penalized Silhouette score [rousseeuw1987silhouettes] to find the optimal threshold cut $\theta^*$ on the constrained hierarchical clustering dendrogram.
**Intuition.** Explicitly penalizing the cluster count enforces compact, meaningful semantic groupings and rejects fragmented boundaries from isolated clients:
$$\theta^*=\arg\max_{\theta}\left(S_{sil}(\theta)-\lambda_{hc}\cdot\Omega(\theta)\right), \tag{15}$$
where $\Omega(\theta)=\max(0,\ N_{clust}(\theta)-(|\mathcal{Y}_{known}|+2))$ represents the hierarchical penalty, and $\lambda_{hc}$ is the regularization factor.
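A sketch of the penalized threshold-cut search with SciPy's agglomerative tools; Ward linkage is an assumption here, since the paper does not pin down the linkage criterion.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import silhouette_score

def optimal_cut(x_nov, n_known, lambda_hc=0.05):
    """Penalized silhouette search over dendrogram cuts (Eq. 15).
    x_nov: [n, d] pooled novel sub-prototypes forming X_nov."""
    Z = linkage(x_nov, method="ward")
    best_labels, best_score = None, -np.inf
    for theta in np.unique(Z[:, 2]):  # candidate cut heights
        labels = fcluster(Z, t=theta, criterion="distance")
        n_clust = labels.max()
        if n_clust < 2 or n_clust >= len(x_nov):
            continue  # silhouette needs 2 <= clusters <= n-1
        penalty = max(0, n_clust - (n_known + 2))  # Omega(theta)
        score = silhouette_score(x_nov, labels) - lambda_hc * penalty
        if score > best_score:
            best_labels, best_score = labels, score
    centers = np.stack([x_nov[best_labels == k].mean(axis=0)
                        for k in np.unique(best_labels)])
    return centers  # candidate novel centers P_cand
```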
**Step 3: Prototype Matching.** To match the candidate centers $\mathcal{P}_{cand}$ with the historical prototypes $\mathcal{P}_{hist}^{(t-1)}$, we compute the cosine similarity matrix $S\in\mathbb{R}^{|\mathcal{P}_{cand}|\times|\mathcal{P}_{hist}|}$, where $S_{r,c}=\cos(P_{cand,r},P_{hist,c}^{(t-1)})$. The optimal assignment is solved via the Hungarian algorithm to maximize the total similarity:
$$\arg\max_{A}\sum_{r,c}A_{r,c}\cdot S_{r,c},\quad\text{s.t.}\ \sum_{r}A_{r,c}\leq 1,\ \sum_{c}A_{r,c}\leq 1, \tag{16}$$
where $A_{r,c}\in\{0,1\}$ is the boolean assignment matrix. To reject mismatches, a match is considered valid only if the similarity exceeds a dynamic threshold: $S_{r,c}>\max(\tau_{base},\bar{S})$, where $\bar{S}$ is the mean similarity of the current matrix, and $\tau_{base}$ is the base semantic similarity threshold, empirically set to 0.3 across all datasets.
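The matching step maps directly onto SciPy's Hungarian solver; a minimal sketch, assuming both prototype sets are already L2-normalized:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_prototypes(cand, hist, tau_base=0.3):
    """Hungarian matching with dynamic rejection (Eq. 16).
    cand: [R, d] candidate centers; hist: [C, d] historical prototypes."""
    S = cand @ hist.T  # cosine similarity matrix for unit-norm rows
    rows, cols = linear_sum_assignment(-S)  # maximize total similarity
    thresh = max(tau_base, S.mean())  # dynamic threshold max(tau_base, S_bar)
    return [(r, c) for r, c in zip(rows, cols) if S[r, c] > thresh]
```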
**Step 4: EMA Update & Memory Routing.** Valid matched pairs undergo an EMA update for temporal smoothing with the corresponding historical prototype $P_{hist,c}^{(t-1)}$, controlled by a momentum factor $\rho$:
$$P_{hist,c}^{(t)}=\text{Normalize}\!\left(\rho\cdot P_{hist,c}^{(t-1)}+(1-\rho)\cdot P_{cand,r}\right), \tag{17}$$
where $r$ is the specific candidate index matched to the historical prototype $c$ via the aforementioned assignment matrix $A$. Unmatched candidates are registered as new additions to the global prototype buffer $\mathcal{P}_{global}$, while unmatched historical prototypes are preserved to prevent forgetting.
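The final routing step, sketched below, pairs the EMA smoothing of Eq. (17) with the registration of unmatched candidates; the list-based buffer is a simplifying assumption.

```python
import numpy as np

def ema_update_memory(hist, cand, matches, rho=0.9, eps=1e-8):
    """EMA update and memory routing (Eq. 17).
    hist: list of [d] historical prototypes; cand: [R, d] candidates;
    matches: valid (r, c) pairs from the matching step."""
    matched = set()
    for r, c in matches:
        p = rho * hist[c] + (1 - rho) * cand[r]  # temporal smoothing
        hist[c] = p / (np.linalg.norm(p) + eps)  # renormalize
        matched.add(r)
    for r in range(len(cand)):
        if r not in matched:
            hist.append(cand[r])  # unmatched candidates become new categories
    return hist  # unmatched historical prototypes are kept to prevent forgetting
```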
## 4 Experiments

Table 2: Statistical information and GCD split protocol of the experimental datasets.

| Dataset | Description | Nodes | Edges | Features | Total categories | Known / Novel | Train/Val/Test* |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Cora | Citation network | 2,708 | 5,429 | 1,433 | 7 | 4 / 3 | 20% / 40% / 40% |
| CiteSeer | Citation network | 3,327 | 4,732 | 3,703 | 6 | 3 / 3 | 20% / 40% / 40% |
| Amazon Photo | Co-purchase graph | 7,487 | 119,043 | 745 | 8 | 4 / 4 | 20% / 40% / 40% |
| Amazon Computers | Co-purchase graph | 13,381 | 245,778 | 767 | 10 | 5 / 5 | 20% / 40% / 40% |
| Coauthor CS | Co-authorship graph | 18,333 | 81,894 | 6,805 | 15 | 8 / 7 | 20% / 40% / 40% |

*Note: The 20%/40%/40% (Train/Val/Test) split applies only to Known categories; Novel categories remain entirely unlabeled.
In this section, we conduct extensive experiments to evaluate the effectiveness of the proposed GCD-FGL framework. We first introduce the graph datasets, describe the GCD scenarios, and outline the baselines and evaluation metrics. Subsequently, we aim to answer the following research questions:

- **Q1 (Performance Comparison):** Compared with other state-of-the-art baselines, can GCD-FGL achieve better performance in GCD scenarios?
- **Q2 (Ablation Study):** Where does the performance gain of GCD-FGL come from?
- **Q3 (Hyperparameter Sensitivity):** How sensitive is GCD-FGL to variations of key hyperparameters?
- **Q4 (Robustness):** How robust is the proposed GCD-FGL method?
- **Q5 (Efficiency):** How does GCD-FGL perform in terms of training efficiency and computational cost?
### 4.1 Experiment Setup

**Datasets and GCD setup.** To evaluate the effectiveness of GCD-FGL, we conduct experiments on five widely used graph benchmark datasets: Cora, CiteSeer, Amazon Photo, Amazon Computers, and Coauthor CS. To simulate a realistic federated GCD environment, we employ the Louvain algorithm [blondel2008fast] to partition the global graph topology into 10 clients.
As summarized in Table [2](https://arxiv.org/html/2605.08178#S4.T2), our proposed two-stage GCD split protocol operates as follows. First, in the Global Category Split, we partition the global label space into disjoint Known and Novel categories. For datasets with an odd number of total categories $|\mathcal{Y}|$, the number of Known categories is set to $\lceil|\mathcal{Y}|/2\rceil$. Second, in the Local Data Masking stage, we construct the dataset masks within each client's isolated subgraph under a transductive setting. Specifically, for Known categories, 20% of the instances are sampled to form the labeled training set, 40% are allocated to the validation set, and the remaining 40% are retained as unlabeled data. These unlabeled Known instances are then combined with 100% of the Novel category instances to concurrently form the unlabeled training set and the testing set, requiring the model to discover new categories across all local unlabeled data.
**Baselines.** We evaluate GCD-FGL against representative state-of-the-art baselines, which are broadly categorized into three groups: adapted computer vision GCD methods, adapted graph GCD methods, and dedicated federated GCD methods. For the adapted CV GCD methods, we select AutoNovel [han2020autonovel], ORCA [cao2022open], GCD [vaze2022generalized], and SimGCD [wen2023parametric]. Following standard adaptation protocols, we extend these centralized models to the federated graph setting by equipping them with a GCN backbone [kipf2016semi] and employing standard FedAvg [mcmahan2017communication] for server-side aggregation. For the adapted graph GCD methods, we evaluate SWIRL [deng2025towards], a state-of-the-art approach tailored for holistic GGCD, which is similarly extended to the federated setting via FedAvg. Finally, to ensure a comprehensive comparison against architectures explicitly designed for distributed scenarios, we benchmark against FedGCD [pu2024federated], a recent framework formulated specifically for federated generalized category discovery.
**Evaluation Metrics.** Following standard GCD evaluation protocols, we adopt a comprehensive metric system. Since the cluster assignments of novel categories are permutation-invariant, the Hungarian algorithm [kuhn1955hungarian] is applied to find the optimal matching between the predicted cluster assignments and the ground-truth labels before computing accuracy. In line with the realistic GCD setting, the total number of clusters ($K$) is not provided as an oracle prior. The evaluated metrics include New Acc (accuracy on novel nodes), Old Acc (accuracy on known nodes), All Acc (overall accuracy on the entire test set), and HRScore. It is important to clarify that All Acc is evaluated independently over the entire test set and is not a direct mathematical derivation of the Old and New accuracies. Specifically, in alignment with recent Graph GCD studies such as SWIRL [deng2025towards], HRScore is defined as the harmonic mean of Old Acc and New Acc, i.e., $2\times(\text{Old Acc}\times\text{New Acc})/(\text{Old Acc}+\text{New Acc})$. Because relying solely on overall, known-category, or novel-category accuracy in isolation cannot reflect a model's comprehensive capability, we adopt HRScore as the primary evaluation metric to quantify the balance between old-knowledge retention and new-category exploration.
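For concreteness, a small sketch of this evaluation logic: a permutation-invariant accuracy via Hungarian matching of predicted clusters to labels, and the HRScore as the harmonic mean of Old and New accuracy. The array conventions are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def cluster_accuracy(y_true, y_pred):
    """Accuracy after optimally matching cluster IDs to ground-truth labels."""
    d = int(max(y_true.max(), y_pred.max())) + 1
    hits = np.zeros((d, d))
    for t, p in zip(y_true, y_pred):
        hits[p, t] += 1
    rows, cols = linear_sum_assignment(-hits)  # maximize matched counts
    return hits[rows, cols].sum() / len(y_true)

def hr_score(old_acc, new_acc):
    """HRScore: harmonic mean of Old Acc and New Acc."""
    return 2 * old_acc * new_acc / (old_acc + new_acc)
```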
**Implementation Details.** All experiments and federated framework simulations are implemented based on the OpenFGL library [li2024openfgl]. We consistently utilize a 2-layer GCN [kipf2016semi] as the backbone architecture. Models are optimized using the Adam optimizer [kingma2014adam] with an initial global learning rate of 0.001 and a weight decay of 5e-4, while the temperature parameter for contrastive learning is set to 0.1. For baseline comparisons, all methods are executed using the default hyperparameters specified in their original papers. Since this work is the first to explore GCD tasks under federated graph learning, there are no existing Fed-GCD-specific methods for direct comparison. To adapt existing centralized and non-graph GCD algorithms to the federated environment, we implement a standard federated baseline protocol. For visual baselines, we replace image augmentations with standard graph contrastive learning via edge dropping and feature masking [zhu2020deep]. Specifically, during training, all baseline models employ FedAvg for model weight aggregation. During evaluation, to enable cross-client category discovery and ensure a fair comparison against our framework, we implement a naive global prototype aggregation mechanism on the server. This mechanism aligns local prototypes from isolated clients to global prototypes via Hungarian matching based on cosine distance, followed by simple averaging. This uniform baseline adaptation ensures that all models are evaluated under the exact same federated prototype-matching conditions. All experiments are conducted on an Ubuntu 22.04 workstation equipped with an Intel Core i9-13900K processor, an NVIDIA GeForce RTX 3090 GPU (24GB), CUDA 12.1, and 64GB of RAM. To ensure the reliability of our findings, we report the mean performance and standard deviation across multiple independent runs.
### 4.2 Performance Comparison

Table 3: Performance comparison across five graph datasets. The best results are highlighted in bold, and the second-best results are shown in italics. The 'Gain' column shows the improvement of GCD-FGL over the best baseline.

| Dataset | Metric | AutoNovel | ORCA | GCD | SimGCD | FedGCD | SWIRL | GCD-FGL | Gain |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Cora | HRScore | 36.98±0.21 | 34.40±0.53 | 37.75±3.59 | 32.99±0.17 | 37.44±6.79 | *48.96±0.86* | **52.85±1.24** | +3.89 |
| | Old | *58.01±0.57* | 41.67±0.26 | 42.63±8.88 | 54.17±0.26 | 52.24±16.17 | 51.60±1.72 | **66.67±1.44** | +8.66 |
| | New | 27.14±0.19 | 29.29±0.76 | 33.88±1.41 | 23.72±0.17 | 29.18±6.53 | **46.58±0.66** | *43.78±1.58* | -2.80 |
| | All | 31.38±0.57 | 30.99±0.65 | 35.08±1.72 | 27.90±0.14 | 32.35±3.45 | *46.27±0.81* | **47.92±1.28** | +1.65 |
| CiteSeer | HRScore | 44.62±0.57 | 49.01±0.31 | 38.81±1.81 | 40.78±0.44 | 32.01±1.64 | *57.16±1.57* | **59.78±1.08** | +2.62 |
| | Old | 52.78±0.83 | **66.07±0.40** | 38.42±0.63 | 37.70±0.75 | 59.07±1.87 | 55.66±2.21 | *60.73±2.09* | -5.34 |
| | New | 38.64±0.73 | 38.95±0.37 | 39.20±3.63 | 44.41±0.15 | 21.95±1.52 | *58.75±2.21* | **58.85±0.72** | +0.10 |
| | All | 41.77±0.30 | 44.95±0.20 | 39.03±2.72 | 42.93±0.20 | 30.17±1.03 | *58.07±1.24* | **59.94±0.81** | +1.87 |
| CS | HRScore | 57.77±2.37 | 59.34±8.28 | 56.01±2.34 | 54.53±0.14 | 53.63±17.45 | *63.94±3.22* | **72.30±6.63** | +8.36 |
| | Old | 62.06±5.38 | 79.87±28.21 | 67.74±6.01 | 56.37±0.12 | 41.31±20.65 | *82.19±10.39* | **82.38±17.01** | +0.19 |
| | New | 54.04±0.77 | 47.21±3.56 | 47.75±1.63 | 52.80±0.24 | **76.41±5.41** | 52.32±0.96 | *64.41±2.48* | -12.00 |
| | All | 55.15±0.33 | 51.74±0.94 | 50.52±1.37 | 53.30±0.23 | *66.55±3.80* | 56.45±0.73 | **71.03±0.33** | +4.48 |
| Photo | HRScore | 48.49±0.55 | 42.54±4.32 | 47.66±3.77 | 37.68±0.69 | 45.79±5.21 | *48.61±2.86* | **51.66±2.14** | +3.05 |
| | Old | 45.78±0.68 | 38.35±6.88 | 43.30±5.34 | 33.78±0.80 | 42.22±8.03 | *62.27±5.04* | **62.29±4.30** | +0.02 |
| | New | *51.53±0.91* | 47.75±2.17 | **53.00±4.77** | 42.61±1.22 | 50.01±5.25 | 39.87±3.24 | 44.13±2.25 | -8.87 |
| | All | 49.83±0.84 | 44.97±0.97 | *50.13±3.21* | 40.00±0.89 | 47.71±2.92 | 46.49±2.40 | **50.19±1.26** | +0.06 |
| Computers | HRScore | 42.17±4.70 | 33.77±5.53 | 46.21±2.32 | *48.26±1.24* | 46.07±3.05 | 40.87±0.23 | **54.62±4.49** | +6.36 |
| | Old | *56.10±2.74* | 52.75±4.51 | 45.61±3.84 | 51.70±1.68 | 45.02±3.06 | 35.02±0.28 | **58.39±6.83** | +2.29 |
| | New | 33.78±5.95 | 24.83±5.90 | 46.83±2.52 | 45.25±1.67 | 47.17±5.43 | *49.08±0.36* | **51.30±5.86** | +2.22 |
| | All | 45.36±1.82 | 39.31±2.33 | 46.20±1.55 | *51.50±0.39* | 46.05±3.30 | 41.79±0.52 | **53.03±3.02** | +1.53 |
To answer Q1, we evaluate the performance of GCD-FGL against baseline models across five benchmark graph datasets under the distributed federated setting (partitioned via the Louvain algorithm). Detailed results are presented in Table [3](https://arxiv.org/html/2605.08178#S4.T3), with the best results highlighted in bold. Measured by the HRScore, which quantifies the balance between known-category retention and novel-category discovery, our proposed method consistently outperforms the baselines across all five datasets. Most notably, GCD-FGL achieves an average gain of +4.86 points in HRScore over the strongest competitors and secures the highest overall accuracy (All Acc) across all benchmarks. For instance, on the Coauthor CS dataset, GCD-FGL yields an HRScore of 72.30%, an improvement of 8.36 points over the SWIRL baseline (63.94%).
While GCD\-FGL achieves state\-of\-the\-art comprehensive performance, it occasionally yields to specific baselines on isolated metrics\. For instance, on the Cora dataset, SWIRL achieves a marginally higher New Acc \(46\.58%\) compared to GCD\-FGL \(43\.78%\)\. However, in FGGCD settings, aggressively optimizing for novel category discovery often induces feature space squeezing, inherently compromising known category retention\. Consequently, SWIRL’s slight gain in New Acc comes at the expense of its Old Acc \(51\.60%, compared to GCD\-FGL’s 66\.67%\)\. This trade\-off emphasizes the critical importance of utilizing HRScore as the primary evaluation metric, as it demonstrates that GCD\-FGL achieves a superior harmonic balance without suffering from catastrophic forgetting of existing knowledge\.
The results also reveal performance fluctuations in existing baselines, particularly concerning known category retention \(Old Acc\)\. For instance, FedGCD \(on the CS dataset\) and SWIRL \(on the Computers dataset\) experience notable accuracy drops on known categories, occasionally falling below their respective performance on novel categories\. This vulnerability stems from their respective architectural limitations\. Although SWIRL is tailored for GGCD, it lacks mechanisms to address the subgraph distribution discrepancy and heterogeneous label spaces inherent in federated settings, thereby hindering its generalization across distributed clients\. Conversely, while FedGCD explicitly accounts for federated heterogeneity, its core logic, originally designed for visual tasks, largely overlooks complex topological dependencies\. Consequently, it is susceptible to structural truncation, limiting its convergence and representation quality, particularly on sparse datasets such as Cora and CiteSeer\.
### 4.3 Method Interpretability

To answer Q2, we conduct an ablation study to systematically isolate and evaluate the individual contributions of the core components within GCD-FGL: the TPR-Guided Unsupervised Flow ($\mathcal{L}_{unsup}$), the Topology-Aware Graph Contrastive Flow ($\mathcal{L}_{gcl}$), and the TRG mechanism. This evaluation is performed under a standard distributed setting (partitioned via the Louvain algorithm). Quantitative results across all five datasets are detailed in Figure [3](https://arxiv.org/html/2605.08178#S4.F3), with the corresponding final t-SNE feature distributions on the CiteSeer dataset further visualized in Figure [4](https://arxiv.org/html/2605.08178#S4.F4).
Each variant deactivates a specific mechanism corresponding to our core components to validate its necessity and effect on the overall model:
- **w/o $\mathcal{L}_{gcl}$:** Removes the Topology-Aware Graph Contrastive Flow, evaluating its role as a structural false-positive truncation mechanism and its preservation of local structural information under heterophily.
- **w/o $\mathcal{L}_{unsup}$:** Removes the TPR-Guided Unsupervised Alignment Flow, examining the impact of the global semantic bridge on dynamically discovering novel categories and capturing cross-client semantic consistency among unlabeled nodes.
- **w/o TRG:** Removes the TRG module by setting the TPR soft weights ($w_v$) to a uniform value in the unsupervised flow, assessing its mitigation of the neighborhood absorption effect through the filtering of heterophilous noise during prototype alignment (a minimal sketch of this weighting follows the list).
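To make the w/o TRG variant concrete, the sketch below contrasts TPR-style soft weighting with the uniform weighting used in the ablation. The product of predictive confidence and a neighbor-agreement smoothness proxy is an illustrative stand-in for the paper's actual TPR formula:

```python
import torch
import torch.nn.functional as F

def tpr_weighted_pseudo_loss(logits, edge_index, uniform=False):
    """Illustrative sketch of the TPR-weighted unsupervised flow (not the
    paper's exact formula). w_v combines predictive confidence with a
    structural-smoothness proxy (neighbor pseudo-label agreement); the
    `uniform=True` branch reproduces the `w/o TRG` ablation."""
    probs = logits.softmax(dim=-1)
    conf, pseudo = probs.max(dim=-1)             # predictive confidence
    src, dst = edge_index                        # COO edge list, shape (2, E)
    agree = (pseudo[src] == pseudo[dst]).float()
    # Mean pseudo-label agreement with neighbors as a smoothness proxy.
    deg = torch.zeros_like(conf).scatter_add_(0, src, torch.ones_like(agree))
    smooth = torch.zeros_like(conf).scatter_add_(0, src, agree) / deg.clamp(min=1)
    w = torch.ones_like(conf) if uniform else conf * smooth
    return (w.detach() * F.cross_entropy(logits, pseudo, reduction="none")).mean()
```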
Figure 3: Quantitative ablation study across all five datasets. The error bars indicate performance variance.

Figure 4: t-SNE visualization of node feature distributions for different ablated variants on the CiteSeer dataset.

First, removing the Topology-Aware Graph Contrastive Flow (w/o $\mathcal{L}_{gcl}$) degrades both performance and representation quality. As observed in Figure [4](https://arxiv.org/html/2605.08178#S4.F4), the absence of $\mathcal{L}_{gcl}$ causes node embeddings to merge into a continuous manifold lacking distinct decision boundaries. Algorithmically, $\mathcal{L}_{gcl}$ introduces a false-positive truncation mechanism that pulls homophilous neighbors together while pushing heterophilous ones apart, forcing representations to respect both semantic boundaries and intrinsic topological structures. Without this guidance, the GNN backbone becomes highly susceptible to neighborhood absorption and over-smoothing. This diminishes intra-class distinctiveness and leads to large performance variance, such as the $37.21 \pm 17.91$ result on Cora indicated by the large error bars in Figure [3](https://arxiv.org/html/2605.08178#S4.F3).
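As a rough illustration of this pull/push behavior, the following sketch treats connected pairs with matching pseudo-labels as positives and truncates mismatched (likely heterophilous) edges into a repulsion term; the paper's actual $\mathcal{L}_{gcl}$ objective may differ in form:

```python
import torch
import torch.nn.functional as F

def topology_contrastive_loss(z, edge_index, pseudo, margin=0.5):
    """Illustrative stand-in for the Topology-Aware Graph Contrastive Flow.
    Connected pairs with matching pseudo-labels are attracted; mismatched
    pairs are truncated from the positive set and repelled instead."""
    z = F.normalize(z, dim=-1)
    src, dst = edge_index
    sim = (z[src] * z[dst]).sum(dim=-1)           # cosine similarity per edge
    homo = (pseudo[src] == pseudo[dst]).float()   # false-positive truncation
    pull = (1.0 - sim) * homo                     # attract homophilous neighbors
    push = F.relu(sim - margin) * (1.0 - homo)    # repel heterophilous neighbors
    return (pull + push).mean()
```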
Second, when the TPR estimation or the unsupervised alignment flow is disabled (w/o TRG or w/o $\mathcal{L}_{unsup}$), Figure [4](https://arxiv.org/html/2605.08178#S4.F4) illustrates that the embedding space becomes fragmented, failing to form compact semantic clusters. The notable performance drop in the w/o TRG variant highlights the necessity of dynamically filtering noisy node-to-prototype assignments. While the TPR metric quantifies node trustworthiness via a logical conjunction of predictive confidence and structural smoothness, disabling TRG forces the unsupervised flow to assign a uniform soft weight to all pseudo-labels. This uniform weighting amplifies unreliable heterophilous signals, exacerbating the representation bias towards inaccurate local prototypes. Similarly, removing the unsupervised flow (w/o $\mathcal{L}_{unsup}$) deprives the unlabeled nodes of the semantic bridge provided by global prototypes, hindering the model's capacity to reliably cluster dynamically discovered novel categories.
Finally, as demonstrated by the Ours variant in Figure [4](https://arxiv.org/html/2605.08178#S4.F4), the complete GCD-FGL framework integrates these components effectively. By leveraging the TRG mechanism to filter heterophilous noise and dynamically combining the contrastive flow ($\mathcal{L}_{gcl}$) with the unsupervised alignment ($\mathcal{L}_{unsup}$), the Topology-Reliable Semantic Alignment and Discovery process calibrates the isolated node representations. This integration enforces intra-class compactness and inter-class separability, resulting in distinct clustering boundaries even under structural truncation.
### 4.4 Hyperparameter Sensitivity
Figure 5: Hyperparameter sensitivity analysis of core components (panels (a)–(d): Cora; panels (e)–(h): Photo; one panel per parameter $\beta$, $\lambda_{hc}$, $\rho$, and $\alpha$). We select Cora as a representative small-scale dataset and Photo as a larger-scale dataset. The variables $\beta$, $\lambda_{hc}$, $\rho$, and $\alpha$ denote the contrastive loss weight, hierarchical clustering penalty, EMA momentum, and pseudo-label scaling factor, respectively. The z-axis represents the classification accuracy.

To answer Q3, we systematically analyze the hyperparameter sensitivity of the proposed GCD-FGL framework under the standard distributed setting (partitioned via the Louvain algorithm). We select Cora and Amazon Photo as representative small-scale and large-scale datasets, respectively. Figure [5](https://arxiv.org/html/2605.08178#S4.F5) visualizes the resulting performance variation across values of four key parameters: the Topology-Aware Graph Contrastive Flow weight ($\beta$), the Constrained Hierarchical Clustering penalty ($\lambda_{hc}$), the EMA Momentum Update decay factor ($\rho$), and the dynamic threshold scaling factor ($\alpha$). The default values for each parameter are marked with dashed lines in the respective subplots.
Empirical results demonstrate stable performance across most evaluated parameter ranges. Specifically, the parameters governing temporal prototype smoothing ($\rho$), server-side cluster regularization ($\lambda_{hc}$), and confident pseudo-label filtering ($\alpha$) exhibit minimal performance fluctuations on both datasets. This indicates that the dynamic thresholding and hierarchical routing mechanisms are stable, reducing the reliance on precise manual tuning.
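For concreteness, the role of $\rho$ can be read as a standard EMA prototype update; the sketch below assumes prototypes are smoothed toward per-class mean embeddings each round, which is our illustrative reading rather than the paper's verbatim rule:

```python
import torch

def ema_prototype_update(protos, z, labels, rho=0.9):
    """Sketch of the EMA momentum update governed by rho
    (assumed form: p_k <- rho * p_k + (1 - rho) * mean of class-k embeddings).
    `protos` has shape (K, d); `z` (N, d); `labels` (N,) long tensor."""
    K, d = protos.shape
    sums = torch.zeros(K, d).index_add_(0, labels, z)
    counts = torch.zeros(K).index_add_(0, labels, torch.ones(len(labels)))
    mask = counts > 0                      # only update classes seen this round
    means = sums[mask] / counts[mask].unsqueeze(-1)
    protos = protos.clone()
    protos[mask] = rho * protos[mask] + (1.0 - rho) * means
    return protos
```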
Sensitivity is observed primarily for the contrastive loss weight ($\beta$) on the denser Amazon Photo dataset (Figure [5(e)](https://arxiv.org/html/2605.08178#S4.F5.sf5)). In graphs with dense homophilous connections, such as Photo, the structural contrastive flow ($\mathcal{L}_{gcl}$) has a direct impact on the embedding space. Setting $\beta$ too low fails to fully exploit local topological homophily to counteract the neighborhood absorption effect. Conversely, a high $\beta$ risks over-regularizing the representation space, potentially interfering with the TPR-Guided Unsupervised Flow and affecting the convergence of global prototypes.
In contrast, on the sparser Cora dataset (Figure [5(a)](https://arxiv.org/html/2605.08178#S4.F5.sf1)), the model is less sensitive to $\beta$ variations owing to its lower structural density. Ultimately, despite this localized sensitivity on denser graphs, the overall performance variation remains bounded, suggesting that GCD-FGL achieves reliable generalization without relying on extensive parameter tuning.
### 4.5 Robustness
Table 4: Robustness results of GCD-FGL under random label sparsity (sparsity rate 10%) across five graph datasets. The best results are highlighted in bold and the second-best results are shown in italics. The 'Gain' column shows the improvement of GCD-FGL over the best baseline.

| Dataset | Metric | AutoNovel | ORCA | GCD | SimGCD | FedGCD | SWIRL | GCD-FGL | Gain |
|---|---|---|---|---|---|---|---|---|---|
| Cora | HRScore | 34.12±1.77 | 23.39±0.65 | 32.67±2.97 | 19.15±2.68 | 36.38±8.25 | *43.26±1.70* | **55.77±2.25** | +12.51 |
| Cora | Old | 42.05±5.16 | 39.59±0.35 | 32.31±5.79 | 37.99±1.43 | 37.95±17.40 | *47.63±3.20* | **62.82±3.10** | +15.19 |
| Cora | New | 28.71±0.71 | 16.60±0.31 | 33.03±0.45 | 12.80±0.30 | 34.93±3.79 | *39.62±1.80* | **50.15±2.85** | +10.53 |
| Cora | All | 30.55±0.10 | 34.38±0.54 | 32.93±0.74 | 32.95±0.23 | 35.34±4.61 | *40.72±1.30* | **51.89±2.16** | +11.17 |
| CiteSeer | HRScore | 42.32±0.45 | 47.82±0.38 | 37.35±1.35 | 40.60±0.65 | 29.23±1.63 | *54.49±7.33* | **61.15±2.17** | +6.66 |
| CiteSeer | Old | 47.22±0.84 | *66.07±0.63* | 36.05±1.78 | 38.06±0.67 | 61.84±0.66 | 51.35±12.85 | **69.59±4.64** | +3.52 |
| CiteSeer | New | 38.34±0.49 | 37.47±0.42 | 38.75±2.06 | 43.51±1.20 | 18.50±1.26 | **58.05±2.68** | *54.51±3.81* | -3.54 |
| CiteSeer | All | 40.30±0.39 | 43.80±0.66 | 38.16±1.56 | 42.31±0.85 | 29.81±1.65 | *56.57±1.81* | **60.21±0.59** | +3.64 |
| CS | HRScore | *58.09±6.97* | 20.63±0.26 | 55.84±2.11 | 54.08±0.09 | 25.35±6.02 | 56.62±8.98 | **69.32±16.74** | +11.23 |
| CS | Old | 59.60±14.23 | 61.17±0.19 | 70.51±5.01 | 56.19±0.12 | *77.78±4.57* | 52.72±15.50 | **79.06±28.59** | +1.28 |
| CS | New | 56.66±3.24 | 12.41±0.03 | 46.23±1.92 | 52.13±0.12 | 15.14±4.29 | *61.14±1.95* | **61.72±4.34** | +0.58 |
| CS | All | 57.06±1.08 | 54.41±0.02 | 49.59±1.47 | 52.69±0.11 | 23.82±3.84 | *59.98±1.79* | **64.12±0.34** | +4.14 |
| Photo | HRScore | 41.65±5.29 | 36.59±5.71 | 42.13±2.58 | 31.56±0.72 | 43.57±9.16 | *45.34±2.96* | **48.54±19.12** | +3.20 |
| Photo | Old | 40.60±5.45 | 30.48±7.61 | 36.84±3.74 | 24.98±0.90 | 45.40±14.63 | *54.03±6.58* | **60.15±10.97** | +6.12 |
| Photo | New | 42.75±9.36 | *45.78±4.98* | **49.18±2.24** | 42.84±0.33 | 41.89±5.55 | 39.04±2.74 | 40.69±26.39 | -8.49 |
| Photo | All | 42.12±5.70 | 41.26±2.28 | *45.53±1.61* | 37.56±0.24 | 42.93±0.82 | 43.47±1.12 | **54.40±7.65** | +8.87 |
| Computers | HRScore | 39.48±5.49 | 46.64±7.60 | 44.85±3.63 | *48.72±4.73* | 47.80±6.53 | 39.18±6.38 | **49.23±0.54** | +0.51 |
| Computers | Old | 48.96±10.59 | *51.85±5.55* | 41.65±6.03 | 51.21±4.05 | 47.60±8.99 | 35.47±8.95 | **59.46±0.66** | +7.61 |
| Computers | New | 33.08±6.01 | 42.37±11.99 | **48.55±2.33** | 46.48±7.15 | *48.00±9.48* | 43.72±8.25 | 42.01±0.72 | -6.54 |
| Computers | All | 41.32±6.77 | 47.29±3.97 | 44.97±2.53 | *48.93±2.10* | 47.80±3.52 | 39.44±1.30 | **51.54±0.39** | +2.61 |

Note: The random label sparsity rate is set to 10%.
To answer Q4, we evaluate the robustness of GCD-FGL under label sparsity within the distributed federated setting (partitioned via the Louvain algorithm). We set the random label sparsity rate to 10% across all clients. In FGGCD, limited supervisory signals, compounded by heterogeneous and partially overlapping label spaces, can predispose models to overfit local known categories, thereby limiting their capacity to discover novel categories. Quantitative results under this sparse regime are detailed in Table [4](https://arxiv.org/html/2605.08178#S4.T4).
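A minimal sketch of imposing this regime follows, under the assumption that the sparsity rate denotes the fraction of training labels retained per client; if it instead denotes the fraction withheld, the selection is simply inverted:

```python
import torch

def sparsify_labels(train_mask, rate=0.10, seed=0):
    """Randomly keep a `rate` fraction of labeled training nodes
    (assumption: the sparsity rate is the fraction of labels retained)."""
    g = torch.Generator().manual_seed(seed)
    idx = train_mask.nonzero(as_tuple=True)[0]
    keep = idx[torch.randperm(len(idx), generator=g)[: int(rate * len(idx))]]
    sparse_mask = torch.zeros_like(train_mask)
    sparse_mask[keep] = True
    return sparse_mask
```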
Under these conditions, GCD-FGL demonstrates stable performance, achieving the highest HRScore and overall accuracy (All Acc) on all five benchmark datasets (Cora, CiteSeer, Coauthor CS, Amazon Photo, and Amazon Computers). For instance, on the CS dataset, the framework yields an HRScore of 69.32% and a novel-category accuracy (New Acc) of 61.72%. On the Computers dataset, it records an HRScore of 49.23%. This performance is primarily attributed to the integration of the TRG mechanism with the client-side semantic alignment flow. When labeled anchors are scarce, these components filter heterophilous noise and regularize node-to-prototype assignments based on structural homophily, enabling the model to leverage unlabeled nodes for representation learning.
While certain baseline methods perform well on isolated metrics or datasets, such as GCD on novel-category discovery (New Acc) for Amazon Photo or ORCA on known-category retention (Old Acc) for CiteSeer, they exhibit performance imbalances under sparse conditions. For example, on the CS dataset, FedGCD achieves an Old Acc of 77.78% but a degraded New Acc of 15.14%. This indicates that, without adequate structural regularization, baselines tend to exhibit local representation bias, overfit to the limited known labels, and struggle to generalize to novel categories. In contrast, GCD-FGL mitigates semantic divergence, maintaining balanced predictions across evaluation metrics despite sparsity.
### 4.6 Efficiency
To answer Q5, we evaluate the training efficiency of GCD-FGL by analyzing the evolution of predictive performance relative to wall-clock running time. This evaluation is conducted under the distributed setting (partitioned via the Louvain algorithm). Figure [6](https://arxiv.org/html/2605.08178#S4.F6) illustrates the training efficiency curves, reporting the mean performance and standard deviation across four datasets (Cora, CiteSeer, Coauthor CS, and Amazon Photo).
Regarding learning dynamics, the performance curves of GCD-FGL exhibit faster initial convergence on all datasets, reaching performance plateaus earlier than the baselines. This efficiency is driven by the Density-Aware Aggregation and Hierarchical Prototype Alignment mechanisms. By aligning the global semantic space and filtering out unreliable local updates, these components provide consistent optimization directions, reducing the number of communication rounds required for convergence. However, a late-stage oscillation is observed on the Cora dataset, where classification accuracy degrades slightly during the final training epochs, likely due to over-regularization of its relatively sparse topology.
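Curves of this kind can be produced with straightforward wall-clock instrumentation; a minimal sketch, with `run_round` and `evaluate` as hypothetical placeholders for the actual federated training loop:

```python
import time

def timed_federated_training(run_round, evaluate, num_rounds=100):
    """Record (elapsed_seconds, accuracy) after each communication round
    to plot accuracy-vs-wall-clock curves like those in Figure 6."""
    start, curve = time.perf_counter(), []
    for _ in range(num_rounds):
        run_round()                       # one full communication round
        curve.append((time.perf_counter() - start, evaluate()))
    return curve
```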
Figure 6: Training efficiency on four datasets (panels: (a) Cora, (b) CiteSeer, (c) Coauthor CS, (d) Amazon Photo). Solid lines denote mean performance over multiple runs, and shaded regions indicate standard deviation.

Figure 7: Comparison of HRScore, All Acc, and average running time across four datasets.

In terms of computational cost, the efficiency of GCD-FGL depends on graph scale, as further visualized in Figure [7](https://arxiv.org/html/2605.08178#S4.F7). On smaller datasets (Cora and CiteSeer), the total running time is comparable to the baselines. Conversely, on denser and larger datasets (CS and Photo), GCD-FGL incurs higher computational overhead per communication round. This discrepancy stems from the TRG mechanism and the Topology-Aware Graph Contrastive Flow: as the graph volume expands, computing local structural smoothness and executing the false-positive truncation mechanism introduce additional cost.
Among the baselines, SWIRL demonstrates competitive convergence stability. As an architecture tailored for holistic GGCD, it leverages topological dependencies better than the image-oriented GCD baselines (AutoNovel, ORCA, and GCD), which exhibit performance bottlenecks on these datasets. Nevertheless, lacking dedicated mechanisms to mitigate subgraph distribution discrepancies and heterogeneous label spaces, SWIRL's convergence trajectory ultimately plateaus below that of GCD-FGL. Overall, while GCD-FGL incurs increased per-round computational cost on large-scale graphs, Figure [7](https://arxiv.org/html/2605.08178#S4.F7) demonstrates that this overhead is offset by its faster convergence rate and higher final accuracy, presenting a practical trade-off for realistic FGGCD scenarios.
## 5 Conclusion
In this paper, we propose GCD-FGL, a framework tailored for FGGCD. A primary challenge in FGGCD is the consistent alignment of dynamically discovered novel categories across isolated and heterogeneous clients, which complicates effective global knowledge aggregation. To address this issue, our method integrates a client-side Topology-Reliable Semantic Alignment and Discovery process, guided by the TRG mechanism, with a server-side Hierarchical Prototype Alignment strategy. Through this architecture, the proposed framework achieves cross-client alignment for novel categories and establishes distinct feature spaces that respect both semantic boundaries and topological homophily.
Experiments across five benchmark datasets demonstrate that GCD-FGL achieves state-of-the-art performance. Robustness and ablation analyses further validate the stability of our method, demonstrating its resilience to label sparsity without exacerbating catastrophic forgetting. While GCD-FGL improves accuracy and stability, it incurs a trade-off in training efficiency: the local Topology-Aware Graph Contrastive Flow and the global Hierarchical Prototype Alignment procedures entail higher computational overhead and longer per-round running times than baseline methods. Nevertheless, by addressing the cross-client novel-category alignment problem, this work provides an effective and practical solution for FGGCD.
## Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
## Acknowledgements
This research was supported by the Shenzhen Fundamental Research Program (JCYJ20230807094104009).
## Data availability
The data used in this research are publicly available graph datasets.