Convex--Concave Quadratic Spectral Filtering for Graph Neural Networks
Summary
Proposes DCQ-GNN, a spectral GNN that uses a compact bank of adaptive convex-concave quadratic filters to improve spectral selectivity without high-order polynomials, achieving competitive results on both homophilic and heterophilic graphs.
View Cached Full Text
Cached at: 06/25/26, 05:07 AM
# Convex–Concave Quadratic Spectral Filtering for Graph Neural Networks††thanks: Accepted at ECML-PKDD 2026. The final authenticated publication will be available online via Springer LNCS. The work described in this paper was supported partially by the National Natural Science Foundation of China (12271111).
Source: [https://arxiv.org/html/2606.24956](https://arxiv.org/html/2606.24956)
Ranhui Yan1, Jia Cai2, Mengzhu Chen2, Haodong Yang2 1School of Computing and Artificial Intelligence, Guangzhou Xinhua University, Guangzhou, Guangdong, China yan\_rh1999@outlook\.com 2School of Statistics and Data Science, Guangdong University of Finance and Economics, Guangzhou, Guangdong, China jiacai1999@gdufe\.edu\.cn,mengzhuchen@student\.gdufe\.edu\.cn,haodongyang@student\.gdufe\.edu\.cn
###### Abstract
Spectral graph neural networks \(GNNs\) interpret message passing as frequency\-selective filtering\. While low\-order spectral filters are efficient, their limited selectivity often leads to weak attenuation outside the passband, whereas high\-order alternatives introduce optimization challenges\. We propose DCQ\-GNN, a spectral GNN based on a compact bank of adaptive convex–concave quadratic filters\. By restricting the filter order to two while explicitly exploiting complementary curvature, DCQ\-GNN improves spectral selectivity as quantified by Dirichlet energy and entropy measures without resorting to high\-order polynomial expansions\. The model fuses filter outputs through a node\-adaptive gating mechanism to enable node\-wise structure\-aware spectral selection\. We provide a formal spectral analysis grounded in Dirichlet energy attenuation, von Neumann entropy, and curvature polarity, and derive explicit characterizations of filter behavior across varying levels of homophily and structural perturbations\. Extensive benchmarks on 10 datasets show that DCQ\-GNN ties for the top average rank \(3\.0\) on heterophilic graphs and obtains the second\-best rank \(4\.2\) on homophilic graphs, remaining competitive with representative high\-order polynomial spectral filters\. Furthermore, under strong structural perturbations, DCQ\-GNN exhibits substantially smaller performance degradation compared to both first\-order and high\-order baselines\. These results demonstrate that curvature\-aware quadratic banks provide a robust and efficient alternative to high\-order spectral models while preserving optimization stability and computational efficiency\.
## 1Introduction
Graph neural networks \(GNNs\) have achieved strong performance across a wide range of relational learning tasks\. From a spectral perspective, many GNN architectures can be interpreted as frequency\-selective filters applied to the graph Laplacian, where message passing corresponds to polynomial filtering in the spectral domain\. This interpretation has inspired a large body of spectral GNN models that explicitly design filter responses over graph frequencies\. Low\-order spectral filters are efficient and stable but exhibit limited selectivity: attenuation outside the passband is gradual and may obscure structural signals \(Fig\.[1](https://arxiv.org/html/2606.24956#S1.F1)\)\. High\-order polynomial filters achieve sharper responses but introduce optimization challenges and sensitivity to structural perturbations\.
Figure 1:Spectral responses of linear and convex–concave quadratic Laplacian filters over the normalized eigenvalue range\[0,2\]\[0,2\]\. The quadratic filter exhibits sharper curvature near the cutoff region compared to the linear baseline\.Existing spectral GNNs primarily pursue expressivity by increasing polynomial order or expanding the filter basis family\. In contrast, we argue that the primary design axis of spectral filters need not be polynomial approximation order, but rather curvature polarity, i\.e\., explicit control over convexity and concavity of the spectral response\. This perspective shifts the goal from universal approximation toward optimizing the stability–selectivity trade\-off under low\-order constraints\. Curvature polarity constrains second derivative sign globally, which cannot be guaranteed in unconstrained polynomial expansions\.
We propose DCQ\-GNN, a spectral GNN based on a compact bank of adaptive convex–concave quadratic filters\. Instead of increasing polynomial degree, DCQ\-GNN fixes the filter order to two and introduces complementary curvature polarities to shape spectral attenuation\. Pairing convex and concave responses yields sharper spectral transitions than linear filters without the instability of high\-order polynomials\. This curvature\-centric design distinguishes DCQ\-GNN from basis\-expansion approaches such as Bernstein\- or Jacobi\-polynomial filters, which improve spectral flexibility via higher\-order polynomial expansions\. Our approach instead leverages controlled curvature to enhance spectral selectivity per parameter, prioritizing robustness and stability over universal approximation capacity\.
To enable structure\-aware spectral adaptation, DCQ\-GNN fuses filter outputs via a node\-adaptive gating mechanism\. This gating module performs localized node\-wise spectral selection, allowing different nodes to emphasize convex or concave responses depending on structural context, thereby introducing node\-wise spectral bias without increasing filter order\. We provide a formal spectral analysis grounded in Dirichlet energy decay gap, von Neumann entropy evolution, and curvature polarity\. Our analysis characterizes how convex and concave quadratic components induce distinct attenuation regimes, revealing a controllable mechanism for modulating spectral energy decay under varying homophily levels and structural perturbations\. Extensive experiments on ten benchmark datasets demonstrate that DCQ\-GNN achieves competitive performance across both homophilic and heterophilic graphs\. Notably, under strong structural perturbations, DCQ\-GNN exhibits smaller performance degradation compared to both first\-order and high\-order baselines, supporting our hypothesis that curvature\-aware low\-order filtering improves robustness\. Our contributions can be summarized as follows:
- •We introduce curvature polarity as an explicit constraint for second\-order spectral GNN filters\. By pairing convex and concave responses, DCQ\-GNN spans the full quadratic filter space characterized in Theorem[8](https://arxiv.org/html/2606.24956#Thmtheorem8), rather than selecting filters by heuristic frequency centers or bandwidths\.
- •We propose a compact convex–concave quadratic filter bank combined with node\-adaptive gating\.
- •We provide a theoretical spectral interpretation via Dirichlet energy decay gap and entropy analysis\.
- •We demonstrate competitive performance and improved robustness across diverse graph regimes spanning both homophilic and heterophilic settings, with computational costs comparable to low\-order propagation\.
Table 1:Performance comparison between DCQ\-GNN and representative polynomial spectral GNNs, including BernNet and JacobiConv\.Compared to Bernstein\- and Jacobi\-based spectral filters that increase polynomial order to enhance flexibility, our approach instead exploits curvature polarity within a fixed second\-order formulation \(Table[1](https://arxiv.org/html/2606.24956#S1.T1)\)\. Detailed comparisons with Bernstein\- and Jacobi\-based spectral filters are provided in Appendix[B](https://arxiv.org/html/2606.24956#A2)\.
## 2The Convex–Concave Quadratic Filter GNN Architecture
Figure 2:Architecture of DCQ\-GNN\. Parallel convex and concave quadratic filters extract complementary frequency components, whose outputs are fused through a node\-adaptive gating mechanism with an all\-pass residual anchor\.We propose DCQ\-GNN, a spectral graph neural network designed to address the limited frequency selectivity of linear \(first\-order\) Laplacian filters while preserving the computational efficiency and numerical stability of low\-order message passing mechanisms\. Rather than increasing polynomial degree, DCQ\-GNN enhances spectral expressivity through a compact bank of adaptive quadratic filters, whose convex–concave curvature induces complementary frequency responses within a strictly second\-order formulation\.
As illustrated in Fig\.[2](https://arxiv.org/html/2606.24956#S2.F2), each DCQ\-GNN layer consists of two components:
- •Adaptive convex–concave quadratic filter bank\.A small set of second\-order spectral filters that extract low\-, mid\-, and high\-frequency components, together with an all\-pass \(identity\) channel\. The filters are parameterized to adapt their transition regions and magnitudes, enabling frequency\-aware representations across graphs with diverse homophily characteristics\.
- •Node\-adaptive gated fusion\.A gating mechanism that combines the filtered channels on a per\-node basis, allowing the model to emphasize structure\-dependent spectral responses when the structure is informative, and to fall back on the all\-pass anchor when graph structure is unreliable or noisy\.
We next formalize the spectral setting, describe the construction of the convex–concave quadratic filter bank, and introduce the node\-adaptive fusion mechanism\.
### 2\.1Preliminaries
LetG=\(𝒱,ℰ\)G=\(\\mathcal\{V\},\\mathcal\{E\}\)be an undirected graph withNNnodes, adjacency matrixA∈ℝN×NA\\in\\mathbb\{R\}^\{N\\times N\}, and diagonal degree matrixDD\. We utilize the symmetrically normalized Laplacian𝐋=I−D−1/2AD−1/2=UΛU⊤\{\\mathbf\{L\}\}=I\-D^\{\-1/2\}AD^\{\-1/2\}=U\{\\Lambda\}U^\{\\top\}, where0≤λ1≤⋯≤λN≤20\\leq\\lambda\_\{1\}\\leq\\dots\\leq\\lambda\_\{N\}\\leq 2\. For node featuresX∈ℝN×FX\\in\\mathbb\{R\}^\{N\\times F\}, a spectral filterg:\[0,2\]→ℝg:\[0,2\]\\rightarrow\\mathbb\{R\}is defined asg\(𝐋\)X=Ug\(Λ\)U⊤Xg\(\\mathbf\{L\}\)X=Ug\(\\Lambda\)U^\{\\top\}X\. To ensure computational efficiency and local message passing, we restrictggto quadratic polynomialsg\(λ\)=a0\+a1λ\+a2λ2g\(\\lambda\)=a\_\{0\}\+a\_\{1\}\\lambda\+a\_\{2\}\\lambda^\{2\}witha0a\_\{0\},a1a\_\{1\}, anda2a\_\{2\}constants, which avoids explicit eigendecomposition while enabling nonlinear spectral shaping\. DCQ\-GNN explicitly leverages the curvature of these second\-order filters to refine frequency responses without increasing the polynomial degree\.
### 2\.2Adaptive Convex–Concave Quadratic Filter Bank
We construct a compact filter bankℬ=\{glow,gmid,ghigh,gall\}\\mathcal\{B\}=\\\{g\_\{\\text\{low\}\},g\_\{\\text\{mid\}\},g\_\{\\text\{high\}\},g\_\{\\text\{all\}\}\\\}, where eachg\(λ\)g\(\\lambda\)is a quadratic polynomial on\[0,2\]\[0,2\]\. Each low\- or high\-pass channel contains convex–concave variants, but the gating mechanism aggregates them into four effective frequency channels\. To introduce adaptivity with minimal parameterization, we employ two learnable scalar parameters: \(i\) a cutoff proxyλ~=2−α\{\\tilde\{\\lambda\}\}=2\-\\alphawithα∈\[0,1\]\\alpha\\in\[0,1\], which controls the spectral location of the quadratic extremum; and \(ii\) a positive scale0<β≤min\{1,λ~2/4\}0<\\beta\\leq\\min\\\{1,\{\\tilde\{\\lambda\}\}^\{2\}/4\\\}111This specific bound onβ\\betais necessary and sufficient to guaranteemaxλ∈\[0,2\]\|g\(λ\)\|≤1\.\\max\_\{\\lambda\\in\[0,2\]\}\|g\(\\lambda\)\|\\leq 1\., which controls the response magnitude and ensures well\-defined curvature\.
Filter Channels\.Let𝐙1=𝐋X\{\\bf Z\}\_\{1\}=\{\\bf L\}X,𝐙2=𝐋𝐙1=𝐋2X\{\\bf Z\}\_\{2\}=\{\\bf L\}\{\\bf Z\}\_\{1\}=\{\\bf L\}^\{2\}X\. Each channel can be expressed as a linear combination ofXX,𝐙1\{\\bf Z\}\_\{1\}, and𝐙2\{\\bf Z\}\_\{2\}, allowing efficient implementation through sparse propagation\.
1. 1\.Low\-pass channel \(homophily\)\.Defineconvex low\-passas glowcvx\(𝐋\)X=βλ~2\(𝐙2−2λ~𝐙1\+λ~2X\),g\_\{\\text\{low\}\}^\{\\text\{cvx\}\}\(\{\\bf L\}\)X=\\frac\{\\beta\}\{\{\\tilde\{\\lambda\}\}^\{2\}\}\(\{\\bf Z\}\_\{2\}\-2\{\\tilde\{\\lambda\}\}\{\\bf Z\}\_\{1\}\+\{\\tilde\{\\lambda\}\}^\{2\}X\),\(1\)which induces sharper attenuation in the high\-frequency regime, andconcave low\-passas glowccv\(𝐋\)X=−βλ~2\(𝐙2−λ~2X\),g\_\{\\text\{low\}\}^\{\\text\{ccv\}\}\(\{\\bf L\}\)X=\-\\frac\{\\beta\}\{\{\\tilde\{\\lambda\}\}^\{2\}\}\(\{\\bf Z\}\_\{2\}\-\{\\tilde\{\\lambda\}\}^\{2\}X\),\(2\)which maintains a broader effective passband across low\-to\-mid frequencies\.
2. 2\.High\-pass channel \(heterophily\)\.To emphasize heterophilic high\-frequency components, we useconvex high\-pass ghighcvx\(𝐋\)X=βλ~2𝐙2,g\_\{\\text\{high\}\}^\{\\text\{cvx\}\}\(\{\\bf L\}\)X=\\frac\{\\beta\}\{\{\\tilde\{\\lambda\}\}^\{2\}\}\{\\bf Z\}\_\{2\},\(3\)which strongly suppresses low\-frequency content, andconcave high\-pass ghighccv\(𝐋\)X=−βλ~2\(𝐙2−2λ~𝐙1\)g\_\{\\text\{high\}\}^\{\\text\{ccv\}\}\(\{\\bf L\}\)X=\-\\frac\{\\beta\}\{\{\\tilde\{\\lambda\}\}^\{2\}\}\(\{\\bf Z\}\_\{2\}\-2\{\\tilde\{\\lambda\}\}\{\\bf Z\}\_\{1\}\)\(4\)to yield a wider active region in the high\-frequency regime\.
3. 3\.Mid\-pass:gmid\(𝐋\)X=−4βλ~2\(𝐙2−λ~𝐙1\)g\_\{\\text\{mid\}\}\(\{\\bf L\}\)X=\-\\frac\{4\\beta\}\{\{\\tilde\{\\lambda\}\}^\{2\}\}\(\{\\bf Z\}\_\{2\}\-\{\\tilde\{\\lambda\}\}\{\\bf Z\}\_\{1\}\), a non\-monotone response emphasizing intermediate frequencies, which are empirically more robust to structural perturbations\[[13](https://arxiv.org/html/2606.24956#bib.bib37)\]\.
4. 4\.All\-pass \(identity\):gall\(𝐋\)X=Xg\_\{\\text\{all\}\}\(\{\\bf L\}\)X=X, which preserves raw features and serves as a structure\-agnostic anchor\.
##### Why quadratic rather than alternative structured bases?
At fixed orderK=2K=2, Chebyshev and Jacobi bases span the same three\-dimensional polynomial space asa0\+a1λ\+a2λ2a\_\{0\}\+a\_\{1\}\\lambda\+a\_\{2\}\\lambda^\{2\}under an invertible change of basis\. They therefore do not enlarge the admissible spectral response class, while making curvature polarity less explicit\. Rational filtersg\(λ\)=p\(λ\)/q\(λ\)g\(\\lambda\)=p\(\\lambda\)/q\(\\lambda\)can realize sharper roll\-offs, but possible poles introduce additional constraints onq\(λ\)q\(\\lambda\)and complicate the uniform boundedness requirement\|g\(λ\)\|≤1\|g\(\\lambda\)\|\\leq 1on\[0,2\]\[0,2\]\. Wavelet\- or framelet\-based decompositions provide multi\-scale separation, but require frame construction and scale\-selection constraints that do not align naturally with the learnable cutoffλ~\\tilde\{\\lambda\}\. The quadratic class is therefore the lowest\-degree polynomial family that simultaneously \(i\) encodes sign\-definite curvature through the single coefficienta2a\_\{2\}, \(ii\) admits an analytic Lipschitz stability bound under Laplacian perturbations \(Theorem[5](https://arxiv.org/html/2606.24956#Thmtheorem5)\), and \(iii\) recovers a second\-order Taylor form of heat diffusion, which motivates the convex low\-pass branch\.
The negative signs are introduced to ensure passband alignment\. The curvature parameters may also be treated as hyperparameters, allowing the model to adapt to graphs with different homophily characteristics\. Specifically, the convex low\-pass filter defined in Eq\. \([1](https://arxiv.org/html/2606.24956#S2.E1)\) yields
glowcvx\(λ\)=βλ~2\(λ2−2λ~λ\+λ~2\),g\_\{\\text\{low\}\}^\{\\text\{cvx\}\}\(\\lambda\)=\\frac\{\\beta\}\{\{\\tilde\{\\lambda\}\}^\{2\}\}\\left\(\\lambda^\{2\}\-2\{\\tilde\{\\lambda\}\}\\lambda\+\{\\tilde\{\\lambda\}\}^\{2\}\\right\),which corresponds to a second\-order Taylor approximation of the heat kernele−τλe^\{\-\\tau\\lambda\}\. Consequently, the convex low\-pass filter primarily acts as a denoising operator by attenuating high\-frequency components\. On the other hand, Whenλ~=1\\tilde\{\\lambda\}=1and the maximal admissible scaleβ=1/4\\beta=1/4is used, the filter pair\{glowcvx,ghighccv\}\\\{g\_\{\\text\{low\}\}^\{\\text\{cvx\}\},g\_\{\\text\{high\}\}^\{\\text\{ccv\}\}\\\}constitutes a scaled partition of unity, since
glowcvx\(λ\)\+ghighccv\(λ\)=β=1/4,∀λ∈\[0,λ~\]\.g\_\{\\text\{low\}\}^\{\\text\{cvx\}\}\(\\lambda\)\+g\_\{\\text\{high\}\}^\{\\text\{ccv\}\}\(\\lambda\)=\\beta=1/4,\\quad\\forall\\lambda\\in\[0,\\tilde\{\\lambda\}\]\.This property ensures that the filter bank uniformly covers the spectral domain without frequency gaps, which is a fundamental requirement for stable signal reconstruction in graph signal processing \(GSP\)\. Unlike standard GCNs, which employ a fixed linear filter1−λ1\-\\lambda, quadratic filters enable frequency\-dependent gain modulation across the spectrum\. This design is closely related to quadratic mirror filters in classical signal processing, where signals are decomposed into sub\-bands using filters with complementary curvatures\. From a spectral geometry perspective, the quadratic term introduces controllable curvature in the frequency response, allowing the model to shape the graph signal manifold beyond the linear bias of traditional GCN filters\.
### 2\.3Node\-Adaptive Gated Fusion
LetHconcat=\[Hlow\|Hmid\|Hhigh\|Hall\]∈ℝN×4FH\_\{\\text\{concat\}\}=\[H\_\{\\text\{low\}\}\|H\_\{\\text\{mid\}\}\|H\_\{\\text\{high\}\}\|H\_\{\\text\{all\}\}\]\\in\\mathbb\{R\}^\{N\\times 4F\}\. We compute node\-wise gating coefficientsS=softmax\(HconcatWgate\+bgate\)S=\\mathrm\{softmax\}\(H\_\{\\text\{concat\}\}W\_\{\\text\{gate\}\}\+b\_\{\\text\{gate\}\}\), whereWgate∈ℝ4F×4W\_\{\\text\{gate\}\}\\in\\mathbb\{R\}^\{4F\\times 4\}andbgate∈ℝ4b\_\{\\text\{gate\}\}\\in\\mathbb\{R\}^\{4\}are learnable parameters\. The layer output is
Hout=σ\(∑k∈\{low,mid,high,all\}Sk⊙\(HkΘk\)\),\{H\_\{\\text\{out\}\}\}=\\sigma\\left\(\\sum\_\{k\\in\\\{\\text\{low\},\\text\{mid\},\\text\{high\},\\text\{all\}\\\}\}S\_\{k\}\\odot\(H\_\{k\}\{\\Theta\}\_\{k\}\)\\right\),whereHk∈ℝN×FH\_\{k\}\\in\{\\mathbb\{R\}\}^\{N\\times F\}is the output of thekk\-th filter channel,Θk∈ℝF×F′\{\\Theta\}\_\{k\}\\in\\mathbb\{R\}^\{F\\times F^\{\\prime\}\}is a channel\-specific linear matrix,SkS\_\{k\}denotes thekk\-th column ofSSbroadcast across feature dimensions,⊙\\odotstands for elementwise multiplication, andσ\\sigmadenotes an activation function\. This mechanism enables a node\-wise spectral mixture model, allowing DCQ\-GNN to adaptively down\-weight structure\-dependent channels and rely on the all\-pass anchor when appropriate\.
### 2\.4Theoretical Analysis
DCQ\-GNN is based on the observation that quadratic spectral filters can improve frequency selectivity over linear filters while preserving low\-order propagation\. We analyze this property from two perspectives: high\-frequency Dirichlet energy attenuation and spectral concentration\. Theorem[1](https://arxiv.org/html/2606.24956#Thmtheorem1)shows that, for any suppression band\[λ∗,2\]\[\\lambda^\{\*\},2\], a bounded strictly quadratic filter can attain a lower worst\-case weighted Dirichlet energy than the optimal bounded linear filtergℓ\(λ\)=1−aλg\_\{\\ell\}\(\\lambda\)=1\-a\\lambda\. This result explains why the quadratic channels in DCQ\-GNN can suppress high\-frequency noise without increasing message\-passing depth\. Theorem[2](https://arxiv.org/html/2606.24956#Thmtheorem2)further shows that a normalized convex quadratic low\-pass filter induces lower von Neumann entropy than its linear counterpart, indicating reduced spectral leakage within each low\-pass channel before node\-adaptive gated fusion\.
#### 2\.4\.1High\-Frequency Dirichlet Energy Decay Gap\.
We consider the normalized Laplacian𝐋\\mathbf\{L\}with spectrumλ∈\[0,2\]\\lambda\\in\[0,2\]\. For a graph signalxx, its Dirichlet energy after applying a spectral filterggis:
ℰg\(x\)=∑i=1Nλi\(g\(λi\)x^i\)2,\\mathcal\{E\}\_\{g\}\(x\)=\\sum\_\{i=1\}^\{N\}\\lambda\_\{i\}\(g\(\\lambda\_\{i\}\)\\hat\{x\}\_\{i\}\)^\{2\},withx^i\\hat\{x\}\_\{i\}theii\-th component ofx^=U⊤x\\hat\{x\}=U^\{\\top\}x\. We analyze worst\-case attenuation over the high\-frequency band
ℋλ∗:=\{λ:λ≥λ∗\},λ∗∈\(0,2\)\.\\mathcal\{H\}\_\{\\lambda^\{\*\}\}:=\\\{\\lambda:\\lambda\\geq\\lambda^\{\*\}\\\},\\quad\\lambda^\{\*\}\\in\(0,2\)\.Then we have the following result\.
###### Theorem 1\(High\-Frequency Dirichlet Energy Decay Gap \)
Let𝒢L=\{1−aλ∣a∈\[0,1\]\}\\mathcal\{G\}\_\{L\}=\\\{1\-a\\lambda\\mid a\\in\[0,1\]\\\}denote the class of linear spectral filters, and𝒢Q=\{1−aλ\+bλ2∣\|g\(λ\)\|≤1,∀λ∈\[0,2\]\}\\mathcal\{G\}\_\{Q\}=\\\{1\-a\\lambda\+b\\lambda^\{2\}\\mid\|g\(\\lambda\)\|\\leq 1,\\forall\\lambda\\in\[0,2\]\\\}denote the class of quadratic spectral filters\. For any transition frequencyλ∗∈\(0,2\)\\lambda^\{\*\}\\in\(0,2\), letgℓ∗∈𝒢Lg\_\{\\ell\}^\{\*\}\\in\\mathcal\{G\}\_\{L\}be the optimal linear filter minimizing the maximum weighted Dirichlet energy in the high\-frequency band\[λ∗,2\]\[\\lambda^\{\*\},2\]\. Then, there exists a strictly quadratic filtergq∈𝒢Qg\_\{q\}\\in\\mathcal\{G\}\_\{Q\}\(whereb≠0b\\neq 0\) that achieves a strictly lower energy bound:
supλ∈\[λ∗,2\]λ\|gq\(λ\)\|2<supλ∈\[λ∗,2\]λ\|gℓ∗\(λ\)\|2\\sup\_\{\\lambda\\in\[\\lambda^\{\*\},2\]\}\\lambda\|g\_\{q\}\(\\lambda\)\|^\{2\}<\\sup\_\{\\lambda\\in\[\\lambda^\{\*\},2\]\}\\lambda\|g\_\{\\ell\}^\{\*\}\(\\lambda\)\|^\{2\}
Theorem[1](https://arxiv.org/html/2606.24956#Thmtheorem1)shows that quadratic filters achieve strictly stronger attenuation within the transition band\[λ∗,λmax\]\[\\lambda^\{\*\},\\lambda\_\{\\max\}\]than any linear filter satisfying the same boundedness constraint\. This result explains why DCQ\-GNN can obtain sharper spectral discrimination without resorting to high\-order polynomial expansions\. Importantly, this improved attenuation arises within a strictly second\-order formulation, preserving the stability and efficiency of low\-order propagation\. The quadratic term introduces controllable curvature that accelerates high\-frequency decay while satisfying\|g\(λ\)\|≤1\|g\(\\lambda\)\|\\leq 1\.
#### 2\.4\.2Spectral Selectivity via von Neumann Entropy \(a measure of spectral spread\)\.
We quantify the spectral footprint of a filter through von Neumann entropy\. LetP=g\(𝐋\)2Tr\(g\(𝐋\)2\)P=\\frac\{g\(\\mathbf\{L\}\)^\{2\}\}\{\\mathrm\{Tr\}\(g\(\\mathbf\{L\}\)^\{2\}\)\}be the normalized power\-response matrix, with eigenvalues\{ηi\}i=1N\\\{\\eta\_\{i\}\\\}\_\{i=1\}^\{N\}\. The entropy isSVN\(P\)=−Tr\(PlnP\)S\_\{\\mathrm\{VN\}\}\(P\)=\-\{\\mathrm\{T\}r\}\(P\\ln P\), lower values indicate a more concentrated \(more selective\) spectral response\. Since𝐋\\mathbf\{L\}is diagonalizable, the entropy reduces to a spectral distribution over eigenvalues, allowing the analysis to be expressed directly in terms of the spectral responseg\(λ\)g\(\\lambda\)\. This leads to the following result\.
###### Theorem 2\(Entropy Majorization\)
Letg1\(λ\)=α1\(λmax−λ\)g\_\{1\}\(\\lambda\)=\\alpha\_\{1\}\(\\lambda\_\{\\max\}\-\\lambda\)andg2\(λ\)=α2\(λmax−λ\)2g\_\{2\}\(\\lambda\)=\\alpha\_\{2\}\(\\lambda\_\{\\max\}\-\\lambda\)^\{2\}be a linear and a convex quadratic low\-pass spectral filter withλmax=2\\lambda\_\{\\max\}=2, respectively, whereα1,α2\\alpha\_\{1\},\\alpha\_\{2\}are normalization constants such that∑ig2\(λi\)=1\\sum\_\{i\}g^\{2\}\(\\lambda\_\{i\}\)=1\. Let𝐩,𝐪∈ℝN\{\\bf p\},\{\\bf q\}\\in\\mathbb\{R\}^\{N\}be the resulting normalized power spectral distributions, wherepi=g12\(λi\)p\_\{i\}=g\_\{1\}^\{2\}\(\\lambda\_\{i\}\)andqi=g22\(λi\)q\_\{i\}=g\_\{2\}^\{2\}\(\\lambda\_\{i\}\)\. Then, the quadratic filter induces a more concentrated spectral response in the sense of von Neumann entropy:
SVN\(𝐪\)≤𝐒VN\(𝐩\)S\_\{\\mathrm\{VN\}\}\(\\bf q\)\\leq S\_\{\\mathrm\{VN\}\}\(\\bf p\)with equality if and only if all non\-zero eigenvalues of the Laplacian are identical\.
DCQ\-GNN uses a filter bank rather than a single operator\. Lower entropy for an individual channel indicates reduced spectral leakage within that channel, facilitating cleaner decomposition before gated recombination\. Intuitively, the quadratic spectral decay concentrates energy more strongly near the passband\. Hence, quadratic filters induce lower spectral entropy, indicating stronger concentration of spectral energy\. However, the adaptive gating mechanism is what prevents this strict concentration from becoming a liability on heterophilic datasets\. The proof of Theorem[2](https://arxiv.org/html/2606.24956#Thmtheorem2)is provided in Appendix[C\.1](https://arxiv.org/html/2606.24956#A3.SS1)\.
### 2\.5Interpretive Analysis
In this section, we analyze the robustness properties of DCQ\-GNN against structural perturbations, followed by an explanation of its performance advantages\. We first demonstrate how the quadratic filter bank intrinsically dampens the impact of adversarial edge injections\.
###### Proposition 3\(Structural Interference Cancellation\)
Let𝐋\\mathbf\{L\}be the symmetrically normalized Laplacian andA^=I−𝐋\\hat\{A\}=I\-\\mathbf\{L\}be the normalized adjacency matrix\. Consider a target nodeuucorrupted by an adversarial edge\(u,v\)\(u,v\)in the regime of homophilic topological noise, where the neighbors’ features of the attackervvsatisfyxt=xv\+ξtx\_\{t\}=x\_\{v\}\+\\xi\_\{t\}with‖ξt‖≤ϵ\\\|\\xi\_\{t\}\\\|\\leq\\epsilonfor allt∈𝒩\(v\)t\\in\\mathcal\{N\}\(v\)\. Assume the degrees ofvvand its neighbors are approximately equal \(dt≈dvd\_\{t\}\\approx d\_\{v\}\)\. Under the simplified concave quadratic filterg\(𝐋\)=2𝐋−𝐋2=I−A^2g\(\\mathbf\{L\}\)=2\\mathbf\{L\}\-\{\\mathbf\{L\}\}^\{2\}=I\-\\hat\{A\}^\{2\}, the direct first\-order adversarial pollutionΔGCN=1dudvxv\\Delta\_\{GCN\}=\\frac\{1\}\{\\sqrt\{d\_\{u\}d\_\{v\}\}\}x\_\{v\}is perfectly neutralized by the second\-order random walk component, yielding a residual representation error strictly bounded by:
‖Err‖≤ϵdudv\.\\\|\\text\{Err\}\\\|\\leq\\epsilon\\sqrt\{\\frac\{d\_\{u\}\}\{d\_\{v\}\}\}\.Consequently, as the local homophily around the attacker increases \(ϵ→0\\epsilon\\rightarrow 0\), the quadratic filter exactly cancels the structural interference\.
Proposition[3](https://arxiv.org/html/2606.24956#Thmtheorem3)reveals that the concave quadratic filter effectively subtracts two\-hop aggregated features from the original node features\. The second\-order termA^2\\hat\{A\}^\{2\}provides a restorative counter\-signal that neutralizes the first\-order malicious injection\. The assumption in Proposition[3](https://arxiv.org/html/2606.24956#Thmtheorem3)represents a worst\-case adversarial scenario where the attacker node lies inside a locally homophilic cluster\. The analysis therefore isolates the structural effect of the quadratic filter rather than modeling arbitrary feature noise\.
Figure 3:Illustration of structural interference cancellation\. While a first\-order GCN propagates adversarial noise from attacker nodevv, the quadratic term in DCQ\-GNN introduces a counteracting signal that helps preserve the feature integrity of the target nodeuu\.#### 2\.5\.1Performance Explanation
The performance advantages of DCQ\-GNN arise from combining filters with complementary curvature polarities, which is described in the following definition\. Combining curvature polarities allows the model to approximate band\-pass behaviors unattainable by identical polarities\. Detailed discussion is presented in Appendix[C\.4](https://arxiv.org/html/2606.24956#A3.SS4)\.
###### Definition 4\(Curvature Polarity\)
For a second\-order spectral filterg\(λ\)g\(\\lambda\), we define its curvature polarity𝒞𝒫\\mathcal\{CP\}as the sign of its second\-order derivative with respect to the eigenvalueλ\\lambda:
𝒞𝒫\(g\):=sgn\(d2g\(λ\)dλ2\)\.\\mathcal\{CP\}\(g\):=\\text\{\\rm sgn\}\\left\(\\frac\{d^\{2\}g\(\\lambda\)\}\{d\\lambda^\{2\}\}\\right\)\.We sayg\(λ\)g\(\\lambda\)has positive polarity \(𝒞𝒫\>0\\mathcal\{CP\}\>0\) if it is strictly convex, and negative polarity \(𝒞𝒫<0\\mathcal\{CP\}<0\) if it is strictly concave\.
As shown in Appendix[C\.4](https://arxiv.org/html/2606.24956#A3.SS4), combining filters with opposing polarities allows DCQ\-GNN to approximate non\-monotonic band\-pass behaviors that are unattainable by identical\-polarity combinations\.
## 3Experiments
We evaluate the proposed DCQ\-GNN on semi\-supervised node classification tasks222Code is available in[github\.com/yanrh1999/DCQ\-GNN](https://arxiv.org/html/2606.24956v1/github.com/yanrh1999/DCQ-GNN)and investigate the following research questions: \(1\)RQ1 \(Expressivity\):Does the proposed convex–concave quadratic filter bank provide stronger generalization capability across the homophily–heterophily spectrum compared to linear and higher\-order polynomial baselines? \(2\)RQ2 \(Robustness\):Does the proposed structural interference cancellation mechanism enhance model stability under adversarial structural perturbations? \(3\)RQ3 \(Ablation Study\):How do different frequency channels \(low\-pass, mid\-pass, high\-pass, and all\-pass\) and the gating mechanism contribute to performance across varying graph structural regimes? \(4\)RQ4 \(Complexity and Performance Analysis\):Why does the quadratic filter bank provide a better efficiency–performance trade\-off than learned high\-order polynomial filters?
### 3\.1Experimental Setup
##### Datasets\.
We evaluate DCQ\-GNN on1010datasets that span a broad homophily–heterophily spectrum, including citation \(CiteSeer, PubMed\), co\-purchase \(Photo, Computers\), and heterophilic graphs \(Texas, Actor, Squirrel, etc\.\)\. Detailed dataset statistics and homophily measures are provided in Appendix[E\.1](https://arxiv.org/html/2606.24956#A5.SS1)\.
##### Baselines\.
Our method is compared against four categories of state\-of\-the\-art GNNs: \(1\) Classical convolution and attention models \(GCN\[[15](https://arxiv.org/html/2606.24956#bib.bib11)\], GAT\[[28](https://arxiv.org/html/2606.24956#bib.bib15)\]\); \(2\) Deep/residual architectures \(JK\-Net\[[31](https://arxiv.org/html/2606.24956#bib.bib16)\], GCNII\[[8](https://arxiv.org/html/2606.24956#bib.bib17)\]\); and \(3\) Representative polynomial/spectral filters \(H2GCN\[[35](https://arxiv.org/html/2606.24956#bib.bib25)\], FAGCN\[[4](https://arxiv.org/html/2606.24956#bib.bib23)\], BernNet\[[12](https://arxiv.org/html/2606.24956#bib.bib19)\], EvenNet\[[16](https://arxiv.org/html/2606.24956#bib.bib20)\], JacobiConv\[[29](https://arxiv.org/html/2606.24956#bib.bib22)\]\), and \(4\) Kernel\-based LHKGNN\[[22](https://arxiv.org/html/2606.24956#bib.bib39)\]\.
##### Experiment settings\.
We follow standard semi\-supervised node classification protocols with60%/20%/20%60\\%/20\\%/20\\%train/validation/test splits333Partitions are generated via simple random sampling \(SRS\)\. We do not enforce class stratification or a minimum sample count per class in the training set\.\. Hyperparameters for all models are optimized using Optuna\[[1](https://arxiv.org/html/2606.24956#bib.bib21)\]under equivalent search budgets to ensure a fair comparison\.
### 3\.2Performance on Node Classification \(RQ1\)
Table 2:Node classification accuracy \(mean±\\pmstandard deviation\) on1010benchmark datasets\. Results are averaged over 10 independent runs\. The best result is highlighted inboldand the second best isunderlined\.DatasetsPhotoPubMedComputersCiteSeerWikiCSAvg\. RankHomophily \(Node\)h=0\.83h=0\.83h=0\.79h=0\.79h=0\.78h=0\.78h=0\.71h=0\.71h=0\.65h=0\.65GCN94\.10±0\.4094\.10\\pm 0\.4088\.53±0\.2888\.53\\pm 0\.2890\.64±0\.3890\.64\\pm 0\.3876\.03±1\.1376\.03\\pm 1\.1382\.02±0\.6982\.02\\pm 0\.699\.0GCNII95\.80±0\.50\\mathbf\{95\.80\\pm 0\.50\}90\.32±0\.30\\mathbf\{90\.32\\pm 0\.30\}92\.19±0\.39\\mathbf\{92\.19\\pm 0\.39\}76\.23±1\.8376\.23\\pm 1\.8384\.24±0\.72¯\\underline\{84\.24\\pm 0\.72\}1\.8\\mathbf\{1\.8\}GAT94\.61±0\.4894\.61\\pm 0\.4888\.12±0\.4888\.12\\pm 0\.4890\.16±0\.8990\.16\\pm 0\.8975\.82±0\.9575\.82\\pm 0\.9584\.23±0\.8284\.23\\pm 0\.827\.4JK\-Net94\.33±0\.5794\.33\\pm 0\.5788\.49±0\.4688\.49\\pm 0\.4691\.50±0\.5591\.50\\pm 0\.5576\.47±1\.2576\.47\\pm 1\.2583\.43±1\.1483\.43\\pm 1\.146\.4H2GCN94\.92±0\.5394\.92\\pm 0\.5389\.52±0\.4789\.52\\pm 0\.4790\.78±0\.5790\.78\\pm 0\.5773\.33±0\.9673\.33\\pm 0\.9683\.40±0\.7683\.40\\pm 0\.767\.2EvenNet94\.32±0\.4394\.32\\pm 0\.4389\.56±0\.4389\.56\\pm 0\.4391\.73±0\.57¯\\underline\{91\.73\\pm 0\.57\}73\.96±1\.1273\.96\\pm 1\.1283\.47±0\.7383\.47\\pm 0\.736\.0FAGCN95\.24±0\.3195\.24\\pm 0\.3189\.21±0\.5589\.21\\pm 0\.5591\.25±0\.7891\.25\\pm 0\.7873\.30±0\.6173\.30\\pm 0\.6183\.58±1\.0383\.58\\pm 1\.036\.2JacobiConv94\.83±0\.2394\.83\\pm 0\.2389\.44±0\.1089\.44\\pm 0\.1091\.20±0\.1891\.20\\pm 0\.1874\.75±1\.2174\.75\\pm 1\.2183\.10±0\.4783\.10\\pm 0\.477\.2BernNet94\.17±0\.4894\.17\\pm 0\.4889\.46±0\.3389\.46\\pm 0\.3389\.56±0\.7889\.56\\pm 0\.7876\.73±0\.97\\mathbf\{76\.73\\pm 0\.97\}83\.80±0\.7583\.80\\pm 0\.756\.2LHK\-GNN95\.65±0\.37¯\\underline\{95\.65\\pm 0\.37\}88\.71±0\.3888\.71\\pm 0\.3891\.27±0\.7291\.27\\pm 0\.7276\.55±1\.16¯\\underline\{76\.55\\pm 1\.16\}83\.56±0\.6483\.56\\pm 0\.644\.4DCQ\-GNN \(ours\)95\.07±0\.5195\.07\\pm 0\.5189\.62±0\.49¯\\underline\{89\.62\\pm 0\.49\}90\.90±0\.5090\.90\\pm 0\.5074\.89±1\.2574\.89\\pm 1\.2585\.50±0\.82\\mathbf\{85\.50\\pm 0\.82\}4\.2¯\\underline\{4\.2\}DatasetsWisconsinActorChameleonSquirrelTexasAvg\. RankHomophily \(Node\)h=0\.17h=0\.17h=0\.15h=0\.15h=0\.10h=0\.10h=0\.08h=0\.08h=0\.06h=0\.06GCN47\.20±5\.6047\.20\\pm 5\.6027\.45±1\.0227\.45\\pm 1\.0241\.34±2\.0841\.34\\pm 2\.0829\.13±1\.6529\.13\\pm 1\.6556\.76±7\.7456\.76\\pm 7\.7410\.6GCNII78\.60±6\.8778\.60\\pm 6\.8737\.43±1\.38\\mathbf\{37\.43\\pm 1\.38\}52\.90±1\.3052\.90\\pm 1\.3036\.73±1\.1836\.73\\pm 1\.1880\.00±7\.0780\.00\\pm 7\.076\.0GAT47\.80±6\.1647\.80\\pm 6\.1629\.14±1\.1829\.14\\pm 1\.1844\.29±2\.4344\.29\\pm 2\.4330\.62±0\.8430\.62\\pm 0\.8455\.68±6\.4255\.68\\pm 6\.429\.8JK\-Net49\.40±4\.5749\.40\\pm 4\.5730\.53±1\.0330\.53\\pm 1\.0342\.29±2\.2042\.29\\pm 2\.2027\.74±1\.8727\.74\\pm 1\.8757\.30±8\.9557\.30\\pm 8\.959\.6H2GCN80\.80±2\.5480\.80\\pm 2\.5434\.76±1\.0634\.76\\pm 1\.0658\.91±1\.0658\.91\\pm 1\.0637\.40±1\.3937\.40\\pm 1\.3974\.47±6\.0474\.47\\pm 6\.046\.8EvenNet81\.80±2\.8281\.80\\pm 2\.8234\.36±0\.6534\.36\\pm 0\.6565\.02±1\.77¯\\underline\{65\.02\\pm 1\.77\}49\.71±0\.8549\.71\\pm 0\.8583\.46±2\.04\\mathbf\{83\.46\\pm 2\.04\}3\.8FAGCN83\.20±4\.40¯\\underline\{83\.20\\pm 4\.40\}32\.62±1\.4432\.62\\pm 1\.4460\.47±3\.0960\.47\\pm 3\.0947\.42±1\.8347\.42\\pm 1\.8380\.25±3\.3280\.25\\pm 3\.325\.4JacobiConv75\.60±5\.4575\.60\\pm 5\.4535\.80±0\.7235\.80\\pm 0\.7266\.34±1\.06\\mathbf\{66\.34\\pm 1\.06\}53\.17±0\.93\\mathbf\{53\.17\\pm 0\.93\}78\.91±3\.7878\.91\\pm 3\.784\.0BernNet85\.20±1\.40\\mathbf\{85\.20\\pm 1\.40\}35\.44±0\.9235\.44\\pm 0\.9261\.19±0\.9461\.19\\pm 0\.9450\.82±0\.6950\.82\\pm 0\.6983\.24±4\.32¯\\underline\{83\.24\\pm 4\.32\}3\.4¯\\underline\{3\.4\}LHK\-GNN81\.40±4\.6481\.40\\pm 4\.6436\.45±1\.29¯\\underline\{36\.45\\pm 1\.29\}64\.58±1\.3764\.58\\pm 1\.3751\.41±0\.6651\.41\\pm 0\.6682\.62±4\.8182\.62\\pm 4\.813\.6DCQ\-GNN \(ours\)82\.20±4\.7782\.20\\pm 4\.7735\.63±1\.3935\.63\\pm 1\.3964\.88±2\.3164\.88\\pm 2\.3152\.26±1\.49¯\\underline\{52\.26\\pm 1\.49\}82\.97±5\.5582\.97\\pm 5\.553\.0\\mathbf\{3\.0\}
Table[2](https://arxiv.org/html/2606.24956#S3.T2)reports node\-classification accuracy on1010datasets covering homophilic and heterophilic regimes\. On homophilic datasets, DCQ\-GNN obtains the second\-best average rank \(4\.24\.2\), outperforming polynomial spectral baselines such as EvenNet \(6\.06\.0\) and BernNet \(6\.26\.2\)\. Although GCNII ranks first \(1\.81\.8\), DCQ\-GNN remains competitive, ranking first on WikiCS and second on PubMed\. These results indicate that curvature\-aware quadratic filtering preserves homophilic smoothing while enabling node\-wise spectral adaptation\.
On heterophilic datasets, DCQ\-GNN matches the best average rank \(3\.03\.0\) and outperforms high\-order spectral baselines including BernNet \(3\.43\.4\), EvenNet \(3\.83\.8\), and JacobiConv \(4\.04\.0\)\. On Squirrel, DCQ\-GNN attains the second\-best accuracy \(52\.26%52\.26\\%\), slightly below JacobiConv \(53\.17%53\.17\\%\)\. This gap suggests that Squirrel may require finer spectral resolution than globally parameterized quadratic channels can provide\. Nevertheless, DCQ\-GNN achieves competitive heterophilic performance with a strictly second\-order formulation, whereas JacobiConv and BernNet rely on higher\-degree polynomial expansions to model complex high\-frequency components\. These results support the efficiency–resolution trade\-off underlying DCQ\-GNN: curvature\-controlled quadratic filters offer robust performance across homophily regimes, while highly irregular datasets may benefit from piecewise or higher\-resolution spectral parameterizations\.
### 3\.3Robustness Analysis \(RQ2\)
Table 3:Accuracy degradation under the DICE structural attack onWikiCS\.↓Δ%\\downarrow\\Delta\\%indicates the relative performance drop compared to the clean graph\.To evaluate robustness against adversarial topological noise, we conduct experiments on the WikiCS dataset \(moderate node homophily,h=0\.65h=0\.65\) to the DICE\[[36](https://arxiv.org/html/2606.24956#bib.bib36)\]structural attack\. Table[3](https://arxiv.org/html/2606.24956#S3.T3)reports the node classification accuracy and the corresponding relative performance degradation under increasing edge perturbation ratios \(1%1\\%,5%5\\%, and10%10\\%\)\.
Across all attack levels, DCQ\-GNN consistently exhibits the highest resilience\. Under the most extreme scenario \(10%10\\%perturbation\), DCQ\-GNN incurs a minimal relative decrease of only9\.26%9\.26\\%\. In contrast, standard spatial methods like GCN and FAGCN suffer severe degradation, dropping by26\.13%26\.13\\%and24\.78%24\.78\\%, respectively\. Furthermore, DCQ\-GNN proves significantly more stable than advanced spectral baselines under the same conditions, outperforming JacobiConv, which experiences a16\.08%16\.08\\%performance drop\. These results validate the structural interference cancellation mechanism\. The concave quadratic component incorporates two\-hop contexts, dampening spurious adversarial edges lacking local support\.
### 3\.4Ablation Study \(RQ3\)
Table 4:Ablation study showing the impact of removing individual channels on the test set\. “w/o” denotes removal of the corresponding component\. Values report accuracy±\\pmstandard deviation, with relative changes in parentheses\.##### Spectral Adaptivity\.
Table[4](https://arxiv.org/html/2606.24956#S3.T4)quantifies the contribution of each spectral channel\. On the homophilic Computers dataset, removing the low\-pass channel causes the largest degradation among channel ablations \(↓8\.89%\\downarrow 8\.89\\%\), indicating that local smoothing remains the dominant signal in this regime\. In contrast, removing the mid\-pass or high\-pass channel yields small gains on Computers \(↑0\.35%\\uparrow 0\.35\\%and↑0\.17%\\uparrow 0\.17\\%\), suggesting that higher\-frequency components may introduce redundant or noisy information on strongly homophilic graphs\. On heterophilic datasets, the high\-pass channel becomes essential: its removal causes the largest channel\-level drops on Wisconsin \(↓13\.38%\\downarrow 13\.38\\%\) and Squirrel \(↓11\.96%\\downarrow 11\.96\\%\)\. These results support the intended role of the filter bank: low\-pass components capture homophilic smoothing, whereas high\-pass components preserve discriminative signals when adjacent nodes often belong to different classes\.
##### Role of the All\-pass Channel\.
The all\-pass channel preserves raw node features and provides a structure\-agnostic reference\. Its removal has almost no effect on Computers \(↓0\.05%\\downarrow 0\.05\\%\), where neighborhood aggregation is reliable, and only a small effect on Squirrel \(↓0\.77%\\downarrow 0\.77\\%\), where other spectral channels can partially compensate\. The largest drop occurs on Wisconsin \(↓10\.46%\\downarrow 10\.46\\%\), indicating that direct feature preservation is critical when topology\-driven aggregation is unstable or weakly aligned with labels\. Thus, the all\-pass channel acts as a fallback pathway, but its contribution depends on the relative informativeness of node attributes and graph structure\.
Figure 4:Distribution of learned filter gate magnitudes on the WikiCS and Texas datasets\.Figure 5:Mean filter gate magnitude across frequency channels on the WikiCS and Texas datasets\.
##### Impact of the Gating Module\.
Removing the gating module degrades performance on all four datasets, with the largest drops on Squirrel \(↓11\.00%\\downarrow 11\.00\\%\) and Computers \(↓7\.90%\\downarrow 7\.90\\%\)\. This confirms that a fixed global mixture of spectral channels is insufficient for graphs with heterogeneous local structures\. On Squirrel, the degradation caused by removing the gate is close to that caused by removing the high\-pass channel \(↓11\.96%\\downarrow 11\.96\\%\), indicating that node\-wise gating is needed to route high\-frequency information effectively\. Figures[4](https://arxiv.org/html/2606.24956#S3.F4)and[5](https://arxiv.org/html/2606.24956#S3.F5)further illustrate this behavior: on the more homophilic WikiCS dataset, the gate assigns larger weights to low\- and all\-pass channels, whereas on the heterophilic Texas dataset, the gate distribution becomes more balanced across frequency channels\. This pattern is consistent with the model’s need to broaden its spectral mixture under stronger heterophily\.
##### Functional Impact of Learnable Filter Parameters\.
The learnable parametersα\\alphaandβ\\betacontrol the cutoff proxy and response scale of the quadratic filters\. Fixing them has limited effect on Computers \(↓0\.86%\\downarrow 0\.86\\%\), Wisconsin \(↓1\.95%\\downarrow 1\.95\\%\), and even improves WikiCS slightly \(↑0\.48%\\uparrow 0\.48\\%\), suggesting that these datasets do not always require fine adjustment of the filter shape\. In contrast, the same fixed\-parameter setting causes a clear drop on Squirrel \(↓6\.68%\\downarrow 6\.68\\%\), where the spectral structure is more irregular\. Thus, learningα\\alphaandβ\\betais most beneficial when the graph requires dataset\-specific spectral boundaries, while its effect can be modest on graphs whose useful frequency bands are easier to separate\.
### 3\.5Complexity and Performance Analysis \(RQ4\)
Table 5:Computational efficiency versus performance on the heterophilicActordataset\. DCQ\-GNN achieves a favorable trade\-off between training time \(ms per epoch\) and accuracy across varying model depthsTT\.To validate the design choice of employing a decoupled quadratic filter bank over high\-order single\-filter methods, we compare the computational efficiency444Runtime evaluations are conducted on a server equipped with an Intel Xeon Platinum 8352V CPU \(12 vCPUs @ 2\.10GHz\) and an NVIDIA L20 GPU\.and performance of DCQ\-GNN against state\-of\-the\-art baselines: BernNet and JacobiConv, on theActordataset\. Table[5](https://arxiv.org/html/2606.24956#S3.T5)summarizes the model size \(parameters\), runtime, and classification accuracy\.
Efficiency vs\. Accuracy:As shown in Table[5](https://arxiv.org/html/2606.24956#S3.T5), whileJacobiConvachieves high accuracy, it incurs a significant computational overhead, requiring over77million parameters and12\.3012\.30ms per epoch due to its complex polynomial basis\. In contrast, DCQ\-GNN \(T=3T=3\) reduces the parameter count by an order of magnitude \(\>10×\>10\\times\) and decreases latency by more than50%50\\%\(5\.885\.88ms vs\.12\.3012\.30ms\), all while maintaining highly competitive accuracy \(35\.47%35\.47\\%vs\.35\.80%35\.80\\%\)\.
Scalability:Notably, DCQ\-GNN remains faster thanJacobiConveven when the filter bank depth is increased toT=5T=5\(9\.909\.90ms\)\. Furthermore, compared toBernNet, DCQ\-GNN \(T=3T=3\) is approximately38%38\\%faster \(5\.885\.88ms vs\.8\.158\.15ms\)\. This demonstrates that our decoupled quadratic formulation effectively aggregates long\-range signals \(theoretically spanning2T2T\-hops\) without the heavy computational burden associated with high\-degree polynomial expansions\.
Robustness to Over\-smoothing:Furthermore, the results demonstrate that DCQ\-GNN effectively mitigates the over\-smoothing problem\. In many spectral GNNs, increasing the propagation depth typically leads to a sharp decline in performance as node representations become indistinguishable\. However, as shown in Table[5](https://arxiv.org/html/2606.24956#S3.T5), DCQ\-GNN maintains stable accuracy as the filter bank depth increases fromT=1T=1toT=5T=5\. Since a depth ofTTtheoretically corresponds to a receptive field of2T2T, our model maintains stable accuracy up to1010\-hop receptive fields, suggesting improved resistance to over\-smoothing\. A discussion of receptive\-field expansion and over\-squashing is provided in Appendix[B\.4](https://arxiv.org/html/2606.24956#A2.SS4)\.
## 4Conclusion and Future Work
This work establishes quadratic spectral filtering as an intermediate design between the limited selectivity of linear filters and the optimization complexity of high\-order polynomials\. We propose DCQ\-GNN, which leverages a compact bank of adaptive convex–concave quadratic filters to achieve sharper spectral shaping and improved noise rejection while maintaining a strictly two\-hop computational footprint\. Our theoretical analysis, grounded in Dirichlet energy and von Neumann entropy, shows that exploiting curvature polarity yields more concentrated spectral responses than first\-order baselines\. Furthermore, we show that the quadratic formulation inherently facilitates structural interference cancellation, providing a second\-order correction mechanism against adversarial edge injections\.
Empirical results show that DCQ\-GNN remains competitive with representative high\-order polynomial spectral filters while improving robustness under structural perturbations and reducing model complexity\. The fixed second\-order parameterization may be less suitable for tasks requiring highly localized, multi\-modal, or non\-polynomial spectral responses\. For irregular spectra such as Squirrel, piecewise quadratic filtering could partition\[0,2\]\[0,2\]into sub\-intervals and learn separate second\-order responses, retaining curvature control while improving spectral localization\. Future work includes robustness bounds under non\-stochastic structural noise and scalable curvature\-aware filter banks\.
## References
- \[1\]T\. Akiba, S\. Sano, T\. Yanase, T\. Ohta, and M\. Koyama\(2019\)Optuna: a next\-generation hyperparameter optimization framework\.InProceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining,pp\. 3023–3031\.Cited by:[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px3.p1.1)\.
- \[2\]U\. Alon and E\. Yahav\(2021\)On the bottleneck of graph neural networks and its practical implications\.InInternational Conference on Learning Representations,Cited by:[§B\.4](https://arxiv.org/html/2606.24956#A2.SS4.p1.12)\.
- \[3\]M\. Balcilar, G\. Renton, P\. Héroux, B\. Gaüzère, S\. Adam, and P\. Honeine\(2021\)Analyzing the expressive power of graph neural networks in a spectral perspective\.InInternational Conference on Learning Representations,Cited by:[§F\.1](https://arxiv.org/html/2606.24956#A6.SS1.p2.1)\.
- \[4\]D\. Bo, X\. Wang, C\. Shi, and H\. Shen\(2021\)Beyond low\-frequency information in graph convolutional networks\.InProceedings of the AAAI Conference on Artificial Intelligence,Vol\.35,pp\. 3950–3957\.Cited by:[7th item](https://arxiv.org/html/2606.24956#A5.I4.i7.p1.1),[§F\.3](https://arxiv.org/html/2606.24956#A6.SS3.p1.1),[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px2.p1.1)\.
- \[5\]S\. Brin and L\. Page\(1998\)The anatomy of a large\-scale hypertextual web search engine\.Computer Networks and ISDN Systems30\(1\-7\),pp\. 107–117\.Cited by:[§F\.2](https://arxiv.org/html/2606.24956#A6.SS2.p1.1)\.
- \[6\]J\. Bruna, W\. Zaremba, A\. Szlam, and Y\. LeCun\(2014\)Spectral networks and locally connected networks on graphs\.InInternational Conference on Learning Representations,Cited by:[§F\.1](https://arxiv.org/html/2606.24956#A6.SS1.p1.1)\.
- \[7\]B\. Chamberlain, J\. Rowbottom, M\. I\. Gorinova, M\. M\. Bronstein, S\. Webb, and E\. Rossi\(2021\)GRAND: graph neural diffusion\.InInternational Conference on Machine Learning,pp\. 1407–1418\.Cited by:[§F\.4](https://arxiv.org/html/2606.24956#A6.SS4.SSS0.Px1.p1.3)\.
- \[8\]M\. Chen, Z\. Wei, Z\. Huang, B\. Ding, and Y\. Li\(2020\)Simple and deep graph convolutional networks\.InInternational Conference on Machine Learning,pp\. 1725–1735\.Cited by:[2nd item](https://arxiv.org/html/2606.24956#A5.I4.i2.p1.1),[§F\.1](https://arxiv.org/html/2606.24956#A6.SS1.p2.1),[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px2.p1.1)\.
- \[9\]E\. Chien, J\. Peng, P\. Li, and O\. Milenkovic\(2021\)Adaptive universal generalized pagerank graph neural network\.InInternational Conference on Learning Representations,Cited by:[§F\.2](https://arxiv.org/html/2606.24956#A6.SS2.p1.1)\.
- \[10\]M\. Defferrard, X\. Bresson, and P\. Vandergheynst\(2016\)Convolutional neural networks on graphs with fast localized spectral filtering\.InAdvances in Neural Information Processing Systems,Vol\.29,pp\. 3844–3852\.Cited by:[§F\.1](https://arxiv.org/html/2606.24956#A6.SS1.p1.1)\.
- \[11\]W\. Hamilton, Z\. Ying, and J\. Leskovec\(2017\)Inductive representation learning on large graphs\.InAdvances in Neural Information Processing Systems,Vol\.30,pp\. 1024–1034\.Cited by:[§F\.1](https://arxiv.org/html/2606.24956#A6.SS1.p2.1)\.
- \[12\]M\. He, Z\. Wei, H\. Xu, Z\. Liu, W\. Ju, X\. Zhang, and C\. Li\(2021\)Bernnet: learning arbitrary graph spectral filters via bernstein approximation\.InAdvances in Neural Information Processing Systems,Vol\.34,pp\. 14239–14251\.Cited by:[9th item](https://arxiv.org/html/2606.24956#A5.I4.i9.p1.1),[§F\.2](https://arxiv.org/html/2606.24956#A6.SS2.p1.1),[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px2.p1.1)\.
- \[13\]J\. Huang, L\. Du, X\. Chen, Q\. Fu, S\. Han, and D\. Zhang\(2023\)Robust mid\-pass filtering graph convolutional networks\.InProceedings of the ACM Web Conference,pp\. 328–338\.Cited by:[§F\.3](https://arxiv.org/html/2606.24956#A6.SS3.p2.1),[item 3](https://arxiv.org/html/2606.24956#S2.I2.i3.p1.1)\.
- \[14\]K\. Huang, Y\. G\. Wang, M\. Li, and P\. Lio\(2024\)How universal polynomial bases enhance spectral graph neural networks: heterophily, over\-smoothing, and over\-squashing\.InInternational Conference on Machine Learning,pp\. 20310–20330\.Cited by:[§F\.2](https://arxiv.org/html/2606.24956#A6.SS2.p1.1)\.
- \[15\]T\. N\. Kipf and M\. Welling\(2017\)Semi\-supervised classification with graph convolutional networks\.InInternational Conference on Learning Representations,Cited by:[1st item](https://arxiv.org/html/2606.24956#A5.I4.i1.p1.1),[§F\.1](https://arxiv.org/html/2606.24956#A6.SS1.p1.1),[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px2.p1.1)\.
- \[16\]R\. Lei, Z\. Wang, Y\. Li, B\. Ding, and Z\. Wei\(2022\)Evennet: ignoring odd\-hop neighbors improves robustness of graph neural networks\.InAdvances in Neural Information Processing Systems,Vol\.35,pp\. 4694–4706\.Cited by:[6th item](https://arxiv.org/html/2606.24956#A5.I4.i6.p1.1),[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px2.p1.1)\.
- \[17\]R\. Levie, F\. Monti, X\. Bresson, and M\. M\. Bronstein\(2019\)Cayleynets: graph convolutional neural networks with complex rational spectral filters\.IEEE Transactions on Signal Processing67\(1\),pp\. 97–109\.Cited by:[§F\.2](https://arxiv.org/html/2606.24956#A6.SS2.p2.1)\.
- \[18\]S\. Luan, C\. h\. Hua, Q\. Lu, J\. Zhu, M\. Zhao, S\. Zhang, X\. Chang, and D\. Precup\(2022\)Revisiting heterophily for graph neural networks\.InAdvances in Neural Information Processing Systems,Vol\.35,pp\. 1362–1375\.Cited by:[§F\.3](https://arxiv.org/html/2606.24956#A6.SS3.p1.1)\.
- \[19\]P\. Mernyei and C\. Cangea\(2020\)Wiki\-cs: a wikipedia\-based benchmark for graph neural networks\.arXiv preprint arXiv:2007\.02901\.Cited by:[3rd item](https://arxiv.org/html/2606.24956#A5.I2.i3.p1.1)\.
- \[20\]S\. Mo, K\. Wu, Q\. Gao, X\. Teng, and J\. Liu\(2025\)AutoSGNN: automatic propagation mechanism discovery for spectral graph neural networks\.InProceedings of the AAAI Conference on Artificial Intelligence,Vol\.39,pp\. 19493–19502\.Cited by:[10th item](https://arxiv.org/html/2606.24956#A5.I4.i10.p1.1)\.
- \[21\]H\. Pei, B\. Wei, K\. C\. C\. Chang, Y\. Lei, and B\. Yang\(2020\)GEOM\-gcn: geometric graph convolutional networks\.InInternational Conference on Learning Representations,Cited by:[1st item](https://arxiv.org/html/2606.24956#A5.I3.i1.p1.1),[3rd item](https://arxiv.org/html/2606.24956#A5.I3.i3.p1.1)\.
- \[22\]T\. Qin, K\. Chen, and Z\. Liu\(2025\)Localized heat kernel for graph neural networks\.InEuropean Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases \(ECML PKDD\),pp\. 457–473\.Cited by:[11st item](https://arxiv.org/html/2606.24956#A5.I4.i11.p1.1),[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px2.p1.1)\.
- \[23\]L\. Rampášek, M\. Galkin, V\. P\. Dwivedi, A\. T\. Luu, G\. Wolf, and D\. Beaini\(2022\)Recipe for a general, powerful, scalable graph transformer\.InAdvances in Neural Information Processing Systems,Vol\.35,pp\. 14712–14725\.Cited by:[§F\.4](https://arxiv.org/html/2606.24956#A6.SS4.SSS0.Px2.p1.2)\.
- \[24\]B\. Rozemberczki, C\. Allen, and R\. Sarkar\(2021\)Multi\-scale attributed node embedding\.Journal of Complex Networks9\(2\),pp\. cnab014\.Cited by:[2nd item](https://arxiv.org/html/2606.24956#A5.I3.i2.p1.1)\.
- \[25\]T\. K\. Rusch, B\. P\. Chamberlain, J\. Rowbottom, S\. Mishra, and M\. M\. Bronstein\(2022\)Graph\-coupled oscillator networks\.InInternational Conference on Machine Learning,pp\. 18881–18909\.Cited by:[§F\.4](https://arxiv.org/html/2606.24956#A6.SS4.SSS0.Px1.p1.3)\.
- \[26\]O\. Shchur, M\. Mumme, A\. Bojchevski, and S\. Günnemann\(2018\)Pitfalls of graph neural network evaluation\.Cited by:[2nd item](https://arxiv.org/html/2606.24956#A5.I2.i2.p1.1)\.
- \[27\]D\. I\. Shuman, S\. K\. Narang, P\. Frossard, A\. Ortega, and P\. Vandergheynst\(2013\)The emerging field of signal processing on graphs: extending high\-dimensional data analysis to networks and other irregular domains\.IEEE Signal Processing Magazine30\(3\),pp\. 83–98\.Cited by:[§F\.1](https://arxiv.org/html/2606.24956#A6.SS1.p1.1)\.
- \[28\]P\. Veličković, G\. Cucurull, A\. Casanova, A\. Romero, P\. Liò, and Y\. Bengio\(2018\)Graph attention networks\.InInternational Conference on Learning Representations,Cited by:[3rd item](https://arxiv.org/html/2606.24956#A5.I4.i3.p1.1),[§F\.1](https://arxiv.org/html/2606.24956#A6.SS1.p2.1),[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px2.p1.1)\.
- \[29\]X\. Wang and M\. Zhang\(2022\)How powerful are spectral graph neural networks\.InInternational Conference on Machine Learning,pp\. 23341–23362\.Cited by:[8th item](https://arxiv.org/html/2606.24956#A5.I4.i8.p1.1),[§F\.2](https://arxiv.org/html/2606.24956#A6.SS2.p1.1),[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px2.p1.1)\.
- \[30\]Z\. Wu, S\. Pan, G\. Long, J\. Jiang, and C\. Zhang\(2022\)Beyond low\-pass filtering: graph convolutional networks with automatic filtering\.IEEE Transactions on Knowledge and Data Engineering35\(7\),pp\. 6687–6697\.Cited by:[§F\.3](https://arxiv.org/html/2606.24956#A6.SS3.p1.1)\.
- \[31\]K\. Xu, C\. Li, Y\. Tian, T\. Sonobe, K\. Kawarabayashi, and S\. Jegelka\(2018\)Representation learning on graphs with jumping knowledge networks\.InInternational Conference on Machine Learning,pp\. 5453–5462\.Cited by:[4th item](https://arxiv.org/html/2606.24956#A5.I4.i4.p1.1),[§F\.1](https://arxiv.org/html/2606.24956#A6.SS1.p2.1),[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px2.p1.1)\.
- \[32\]Z\. Yang, W\. Cohen, and R\. Salakhutdinov\(2016\)Revisiting semi\-supervised learning with graph embeddings\.InInternational Conference on Machine Learning,pp\. 40–48\.Cited by:[1st item](https://arxiv.org/html/2606.24956#A5.I2.i1.p1.1)\.
- \[33\]C\. Ying, T\. Cai, S\. Luo, S\. Zheng, G\. Ke, D\. He, Y\. Shen, and T\. Liu\(2021\)Do transformers really perform bad for graph representation?\.InAdvances in Neural Information Processing Systems,Vol\.34,pp\. 28877–28888\.Cited by:[§F\.4](https://arxiv.org/html/2606.24956#A6.SS4.SSS0.Px2.p1.2)\.
- \[34\]X\. Zheng, B\. Zhou, J\. Gao, Y\. Wang, P\. Lió, M\. Li, and G\. Montufar\(2021\)How framelets enhance graph neural networks\.InInternational Conference on Machine Learning,pp\. 12761–12771\.Cited by:[§F\.3](https://arxiv.org/html/2606.24956#A6.SS3.p1.1)\.
- \[35\]J\. Zhu, Y\. Yan, L\. Zhao, M\. Heimann, L\. Akoglu, and D\. Koutra\(2020\)Beyond homophily in graph neural networks: current limitations and effective designs\.InAdvances in Neural Information Processing Systems,Vol\.33,pp\. 7793–7804\.Cited by:[5th item](https://arxiv.org/html/2606.24956#A5.I4.i5.p1.1),[§F\.1](https://arxiv.org/html/2606.24956#A6.SS1.p2.1),[§3\.1](https://arxiv.org/html/2606.24956#S3.SS1.SSS0.Px2.p1.1)\.
- \[36\]D\. Zügner, A\. Akbarnejad, and S\. Günnemann\(2018\)Adversarial attacks on neural networks for graph data\.InProceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining,pp\. 2847–2856\.Cited by:[§F\.3](https://arxiv.org/html/2606.24956#A6.SS3.p2.1),[§3\.3](https://arxiv.org/html/2606.24956#S3.SS3.p1.4)\.
## Appendices
## Appendix ANotation
Table[6](https://arxiv.org/html/2606.24956#A1.T6)summarizes the key notations and symbols used throughout this paper\.
Table 6:Key notations used throughout the paper\.
## Appendix BMore Discussion
In this section, we first present a technical comparison between DCQ\-GNN and representative adaptive polynomial spectral methods, including ChebyNet, BernNet, and JacobiConv\. We then analyze the sensitivity of the proposed model and discuss its computational complexity\.
### B\.1Comparison with Polynomial Spectral GNNs
Polynomial spectral GNNs such as ChebyNet, BernNet, and JacobiConv learn spectral filters through basis expansions\. Specifically, these models parameterize the filter as a linear combination of predefined polynomial bases, where the learned coefficients implicitly determine the spectral response\.
In contrast, the proposed DCQ\-GNN does not rely on polynomial basis expansions to approximate arbitrary filters\. Instead, it explicitly constructs a filter bank characterized by curvature polarity\. Each filter is designed to satisfy structural second\-order derivative constraints \(convex, concave, or mid\-pass\), leading to interpretable and stable spectral responses\. This design shifts the focus from polynomial degree to curvature structure, enabling expressive filtering with a small number of parameters\.
### B\.2Sensitivity Analysis
We analyze the stability by considering sensitivity to perturbations in both the spatial and spectral domains\.
Spatial Sensitivity\.Differentiating the quadratic propagation operator with respect to a perturbed edge weightwuvw\_\{uv\}yields
∇wuvhuquad=𝐱v−γ⋅hvlocal,\\nabla\_\{w\_\{uv\}\}h\_\{u\}^\{\\mathrm\{quad\}\}=\\mathbf\{x\}\_\{v\}\-\\gamma\\cdot h\_\{v\}^\{\\mathrm\{local\}\},\(5\)whereγ\\gammais a coefficient derived from the second\-order filter weight,hvlocalh\_\{v\}^\{\\mathrm\{local\}\}denotes the local aggregated representation of nodevvobtained after one propagation step\. If the attacker nodevvis a “bridge” to a distinct community \(heterophilic noise\),hvlocalh\_\{v\}^\{\\mathrm\{local\}\}differs from𝐱v\\mathbf\{x\}\_\{v\}, and the filter maintains high sensitivity\. However, if the perturbation is locally homophilic, the gradient is suppressed\.
Spectral Sensitivity\.Structural perturbations induce shifts in the Laplacian eigenvalues\. We quantify the filter’s sensitivity to these shifts via the derivative magnitude𝒮g\(λ\)=\|g′\(λ\)\|\\mathcal\{S\}\_\{g\}\(\\lambda\)=\|g^\{\\prime\}\(\\lambda\)\|\. For a linear filtergℓ\(λ\)=1−12λg\_\{\\ell\}\(\\lambda\)=1\-\\frac\{1\}\{2\}\\lambda, the sensitivity is constant:𝒮ℓ\(λ\)=0\.5\\mathcal\{S\}\_\{\\ell\}\(\\lambda\)=0\.5\. For our convex high\-pass filtergq\(λ\)=βλ~2λ2g\_\{q\}\(\\lambda\)=\\frac\{\\beta\}\{\{\\tilde\{\\lambda\}\}^\{2\}\}\\lambda^\{2\}, the sensitivity𝒮q\(λ\)=2βλ~2λ\\mathcal\{S\}\_\{q\}\(\\lambda\)=\\frac\{2\\beta\}\{\{\\tilde\{\\lambda\}\}^\{2\}\}\\lambdavanishes asλ→0\\lambda\\to 0\. This implies that the quadratic filter exhibits reduced sensitivity near low\-frequency components of the spectrum\. Conversely, for low\-pass components, the quadratic form provides a steeper “roll\-off” near the cutoff, improving rejection of high\-frequency topology noise\.
### B\.3Computational Complexity
The computational cost of DCQ\-GNN is primarily dominated by the Sparse\-Dense Matrix Multiplications \(SpMM\) required for spectral propagation\. For a graphG=\(𝒱,ℰ\)G=\(\\mathcal\{V\},\\mathcal\{E\}\)withNNnodes,MMedges, and input featuresX∈ℝN×FX\\in\\mathbb\{R\}^\{N\\times F\}, the complexity for a single layer with a filter bank of sizeBBis analyzed as follows: \(1\) Spectral Propagation\. Each quadratic spectral filterg\(𝐋\)=a0I\+a1𝐋\+a2𝐋2g\(\\mathbf\{L\}\)=a\_\{0\}I\+a\_\{1\}\\mathbf\{L\}\+a\_\{2\}\\mathbf\{L\}^\{2\}is implemented via recursive neighborhood aggregation operations\. The second\-order term𝐋2X\\mathbf\{L\}^\{2\}Xis computed as𝐋\(𝐋X\)\\mathbf\{L\}\(\\mathbf\{L\}X\), requiring two SpMM operations\. Thus, forBBfilters, the propagation cost is𝒪\(BMF\)\\mathcal\{O\}\(BMF\)\. \(2\) Adaptive Gating and Fusion\. The node\-wise attention mechanism involves a linear projection to compute scores and a weighted summation ofBBfilter outputs\. This contributes a complexity of𝒪\(BNF\)\\mathcal\{O\}\(BNF\)\. The total complexity per layer is𝒪\(B\(MF\+NF\)\)\\mathcal\{O\}\(B\(MF\+NF\)\), which remains linear with respect to the number of edgesMM\. This ensures that DCQ\-GNN maintains the efficiency of low\-order spatial GNNs while offering enhanced spectral selectivity\.
### B\.4Receptive Field Expansion and Over\-squashing
Over\-squashing\.Over\-squashing\[[2](https://arxiv.org/html/2606.24956#bib.bib44)\]refers to the compression of long\-range signals through narrow graph bottlenecks as propagation depth increases\. In DCQ\-GNN, each quadratic layer uses second\-order Laplacian filtering, whereg\(𝐋\)=a0I\+a1𝐋\+a2𝐋2g\(\\mathbf\{L\}\)=a\_\{0\}I\+a\_\{1\}\\mathbf\{L\}\+a\_\{2\}\\mathbf\{L\}^\{2\}can be equivalently expressed as a weighted combination ofII,A^\\hat\{A\}, andA^2\\hat\{A\}^\{2\}since𝐋=I−A^\\mathbf\{L\}=I\-\\hat\{A\}\. Thus, a single layer aggregates identity, one\-hop, and two\-hop components without stacking two sequential first\-order layers\. For a graph with average degreedd, the two\-hop term can access up to𝒪\(d2\)\\mathcal\{O\}\(d^\{2\}\)local routes, increasing the number of parallel aggregation paths before depth is increased\. This second\-order expansion provides a wider local receptive field than a purely first\-order propagation step while keeping the per\-layer propagation order fixed\. Consequently, increasing the filter bank depthTTexpands the receptive field to at most2T2Thops without relying on a depth\-2T2Tstack of first\-order transformations\. This mechanism is consistent with the stable accuracy observed fromT=1T=1toT=5T=5in Table[5](https://arxiv.org/html/2606.24956#S3.T5)\.
## Appendix CExtended Theoretical Results
### C\.1Proof of Main Results
This section first presents the proofs of Theorems[1](https://arxiv.org/html/2606.24956#Thmtheorem1)and[2](https://arxiv.org/html/2606.24956#Thmtheorem2), together with Proposition[3](https://arxiv.org/html/2606.24956#Thmtheorem3), and then provides additional results on quadratic spectral filter decomposition\.Proof\.\[Proof of Theorem[1](https://arxiv.org/html/2606.24956#Thmtheorem1)\] Letfℓ\(a\)=supλ∈\[λ∗,2\]λ\(1−aλ\)2f\_\{\\ell\}\(a\)=\\sup\_\{\\lambda\\in\[\\lambda^\{\*\},2\]\}\\lambda\(1\-a\\lambda\)^\{2\}, anda∗a^\{\*\}be the optimal coefficient that minimizes this maximum error, defining the optimal linear filter
gℓ∗\(λ\)=1−a∗λ\.g\_\{\\ell\}^\{\*\}\(\\lambda\)=1\-a^\{\*\}\\lambda\.Becausegℓ∗\(λ\)g\_\{\\ell\}^\{\*\}\(\\lambda\)is monotonic and linear, the minimax criterion dictates that its maximum weighted energy, denoted asMLM\_\{L\}, must be balanced at the boundaries of the transition band:
ML=λ∗\(1−a∗λ∗\)2=2\(1−2a∗\)2\.M\_\{L\}=\\lambda^\{\*\}\(1\-a^\{\*\}\\lambda^\{\*\}\)^\{2\}=2\(1\-2a^\{\*\}\)^\{2\}\.To balance these bounds, the filter must cross zero within the interval, implyinggℓ∗\(λ∗\)\>0g\_\{\\ell\}^\{\*\}\(\\lambda^\{\*\}\)\>0andgℓ∗\(2\)<0g\_\{\\ell\}^\{\*\}\(2\)<0\. We now construct a quadratic filtergq\(λ\)∈𝒢Qg\_\{q\}\(\\lambda\)\\in\\mathcal\{G\}\_\{Q\}that strictly reduces this maximum energy\. Define a two\-parameter perturbation:
gq\(λ\)=1−\(a∗\+δ\)λ\+bλ2,g\_\{q\}\(\\lambda\)=1\-\(a^\{\*\}\+\\delta\)\\lambda\+b\\lambda^\{2\},whereδ\>0\\delta\>0andb\>0b\>0are arbitrarily small perturbation constants\. We analyze the filter’s energy at the critical boundaries:
\(1\) At the upper boundaryλ=2\\lambda=2: We require the absolute magnitude of the filter to be strictly reduced:
\|gq\(2\)\|<\|gℓ∗\(2\)\|\.\|g\_\{q\}\(2\)\|<\|g\_\{\\ell\}^\{\*\}\(2\)\|\.Sincegℓ∗\(2\)=1−2a∗<0g\_\{\\ell\}^\{\*\}\(2\)=1\-2a^\{\*\}<0, we substitute our perturbed filter and reverse the signs:
−\(1−2a∗−2δ\+4b\)<−\(1−2a∗\)\-\(1\-2a^\{\*\}\-2\\delta\+4b\)<\-\(1\-2a^\{\*\}\)This simplifies to the condition4b−2δ\>04b\-2\\delta\>0, or equivalently:
\(2\) At the lower boundaryλ=λ∗\\lambda=\\lambda^\{\*\}: Similarly, we require\|gq\(λ∗\)\|<\|gℓ∗\(λ∗\)\|\|g\_\{q\}\(\\lambda^\{\*\}\)\|<\|g\_\{\\ell\}^\{\*\}\(\\lambda^\{\*\}\)\|\. Sincegℓ∗\(λ∗\)\>0g\_\{\\ell\}^\{\*\}\(\\lambda^\{\*\}\)\>0:
1−\(a∗\+δ\)λ∗\+a\(λ∗\)2<1−a∗λ∗\.1\-\(a^\{\*\}\+\\delta\)\\lambda^\{\*\}\+a\(\\lambda^\{\*\}\)^\{2\}<1\-a^\{\*\}\\lambda^\{\*\}\.This simplifies to:
b\(λ∗\)2−δλ∗<0⟹b<δλ∗\.b\(\\lambda^\{\*\}\)^\{2\}\-\\delta\\lambda^\{\*\}<0\\implies b<\\frac\{\\delta\}\{\\lambda^\{\*\}\}\.Becauseλ∗∈\(0,2\)\\lambda^\{\*\}\\in\(0,2\), we have12<1λ∗\\frac\{1\}\{2\}<\\frac\{1\}\{\\lambda^\{\*\}\}\. Therefore, there exists a non\-empty feasible region for our perturbation parameters\. We can cleanly choose a sufficiently smallδ\>0\\delta\>0and setbbsuch that:
δ2<b<δλ∗\.\\frac\{\\delta\}\{2\}<b<\\frac\{\\delta\}\{\\lambda^\{\*\}\}\.By satisfying this condition, we guarantee that the weighted Dirichlet energy is strictly reduced at both boundaries:
λ∗\|gq\(λ∗\)\|2<MLand2\|gq\(2\)\|2<ML\.\\lambda^\{\*\}\|g\_\{q\}\(\\lambda^\{\*\}\)\|^\{2\}<M\_\{L\}\\quad\\text\{and\}\\quad 2\|g\_\{q\}\(2\)\|^\{2\}<M\_\{L\}\.Because the polynomials are continuous and evaluated on a compact interval, a sufficiently small choice ofδ\\deltaensures that no interior pointλ∈\(λ∗,2\)\\lambda\\in\(\\lambda^\{\*\},2\)exceeds the new strictly lowered bounds\.
Furthermore, because the perturbation is arbitrarily small andgq\(0\)=1g\_\{q\}\(0\)=1, the filter naturally respects the stability constraintsupλ∈\[0,2\]\|gq\(λ\)\|≤1\\sup\_\{\\lambda\\in\[0,2\]\}\|g\_\{q\}\(\\lambda\)\|\\leq 1\. Consequently, we achieve:
supλ∈\[λ∗,2\]λ\|gq\(λ\)\|2<ML\.\\sup\_\{\\lambda\\in\[\\lambda^\{\*\},2\]\}\\lambda\|g\_\{q\}\(\\lambda\)\|^\{2\}<M\_\{L\}\.This confirms the existence of a strictly superior quadratic filter, completing the proof\.□\\square
□\\square
Proof\.\[Proof of Theorem[2](https://arxiv.org/html/2606.24956#Thmtheorem2)\] Let the eigenvalues of the normalized Laplacian be ordered0=λ1≤λ2≤⋯≤λN≤20=\\lambda\_\{1\}\\leq\\lambda\_\{2\}\\leq\\dots\\leq\\lambda\_\{N\}\\leq 2\. The normalized power spectral components are:
pi=\(λmax−λi\)2∑k=1N\(λmax−λk\)2,qi=\(λmax−λi\)4∑k=1N\(λmax−λk\)4\.p\_\{i\}=\\frac\{\(\\lambda\_\{\\max\}\-\\lambda\_\{i\}\)^\{2\}\}\{\\sum\_\{k=1\}^\{N\}\(\\lambda\_\{\\max\}\-\\lambda\_\{k\}\)^\{2\}\},\\quad q\_\{i\}=\\frac\{\(\\lambda\_\{\\max\}\-\\lambda\_\{i\}\)^\{4\}\}\{\\sum\_\{k=1\}^\{N\}\(\\lambda\_\{\\max\}\-\\lambda\_\{k\}\)^\{4\}\}\.Define the ratiori=qipi=C⋅\(λmax−λi\)2r\_\{i\}=\\frac\{q\_\{i\}\}\{p\_\{i\}\}=C\\cdot\(\\lambda\_\{\\max\}\-\\lambda\_\{i\}\)^\{2\}, whereCCis the ratio of the normalization constants\. Sinceλmax−λi\\lambda\_\{\\max\}\-\\lambda\_\{i\}is non\-increasing inii, the sequence\{ri\}i=1N\\\{r\_\{i\}\\\}\_\{i=1\}^\{N\}is non\-increasing\.
Notice that∑pi=∑qi=1\\sum p\_\{i\}=\\sum q\_\{i\}=1, the differencedi=qi−pid\_\{i\}=q\_\{i\}\-p\_\{i\}changes sign at least once\. Becauserir\_\{i\}is non\-increasing, there exists an indexk∗k^\{\*\}such that:
- •qi≥piq\_\{i\}\\geq p\_\{i\}fori≤k∗i\\leq k^\{\*\}
- •qi<piq\_\{i\}<p\_\{i\}fori\>k∗i\>k^\{\*\}\.
This single\-crossing property implies that the mass of the distribution𝐪\\mathbf\{q\}is more heavily weighted toward the smaller indices \(lower frequencies\) compared to𝐩\\mathbf\{p\}\.
LetPk=∑i=1kpiP\_\{k\}=\\sum\_\{i=1\}^\{k\}p\_\{i\}andQk=∑i=1kqiQ\_\{k\}=\\sum\_\{i=1\}^\{k\}q\_\{i\}be the cumulative sums\. From the single\-crossing property:
Fork≤k∗k\\leq k^\{\*\},Qk−Pk=∑i=1k\(qi−pi\)≥0Q\_\{k\}\-P\_\{k\}=\\sum\_\{i=1\}^\{k\}\(q\_\{i\}\-p\_\{i\}\)\\geq 0because all terms are non\-negative\.
Fork\>k∗k\>k^\{\*\},Qk−Pk=\(QN−PN\)−∑i=k\+1N\(qi−pi\)Q\_\{k\}\-P\_\{k\}=\(Q\_\{N\}\-P\_\{N\}\)\-\\sum\_\{i=k\+1\}^\{N\}\(q\_\{i\}\-p\_\{i\}\)\.
SinceQN=PN=1Q\_\{N\}=P\_\{N\}=1and\(qi−pi\)<0\(q\_\{i\}\-p\_\{i\}\)<0fori\>k∗i\>k^\{\*\}, the sum∑i=k\+1N\(qi−pi\)\\sum\_\{i=k\+1\}^\{N\}\(q\_\{i\}\-p\_\{i\}\)is negative\. Thus,Qk−Pk≥0Q\_\{k\}\-P\_\{k\}\\geq 0for allkk\. By definition, this satisfies the condition for majorization:𝐪≻𝐩\\mathbf\{q\}\\succ\\mathbf\{p\}\.
Since𝐋\{\\bf L\}\(thusPP\) is diagonal, then the von Neumann entropySVN\(P\)=−∑ηilnηiS\_\{\\mathrm\{VN\}\}\(P\)=\-\\sum\\eta\_\{i\}\\ln\\eta\_\{i\}is a strictly Schur\-concave function\. By the property of Schur\-concavity:
𝐪≻𝐩⟹SVN\(𝐪\)≤SVN\(𝐩\)\.\\mathbf\{q\}\\succ\\mathbf\{p\}\\implies S\_\{\\mathrm\{VN\}\}\(\\mathbf\{q\}\)\\leq S\_\{\\mathrm\{VN\}\}\(\\mathbf\{p\}\)\.The inequality is strict unless𝒫\\mathcal\{P\}is a permutation of𝒬\\mathcal\{Q\}, which, given the monotonicity of the filters, occurs only if the relevant eigenvalues are degenerate\.□\\square□\\square
Proof\.\[Proof of Proposition[3](https://arxiv.org/html/2606.24956#Thmtheorem3)\] Consider the simplified concave quadratic filterg\(𝐋\)=2𝐋−𝐋2g\(\\mathbf\{L\}\)=2\\mathbf\{L\}\-\\mathbf\{L\}^\{2\}\. In terms of the normalized adjacencyA^=I−𝐋\\hat\{A\}=I\-\\mathbf\{L\}, we have:
g\(𝐋\)=2\(I−A^\)−\(I−A^\)2=I−A^2g\(\\mathbf\{L\}\)=2\(I\-\\hat\{A\}\)\-\(I\-\\hat\{A\}\)^\{2\}=I\-\\hat\{A\}^\{2\}To formalize the structural interference cancellation, consider the output representation for a target nodeuu, given byhu=xu−\[A^2X\]uh\_\{u\}=x\_\{u\}\-\[\\hat\{A\}^\{2\}X\]\_\{u\}\. Suppose an adversarial edge\(u,v\)\(u,v\)is injected into the graph to corrupt nodeuu\. In a standard first\-order GCN \(wherehGCN=A^Xh\_\{GCN\}=\\hat\{A\}X\), the direct feature pollution from the adversarial nodevvontouuis given by:
ΔGCN=A^uvxv=1dudvxv\.\\Delta\_\{GCN\}=\\hat\{A\}\_\{uv\}x\_\{v\}=\\frac\{1\}\{\\sqrt\{d\_\{u\}d\_\{v\}\}\}x\_\{v\}\.In DCQ\-GNN, the application ofA^2\\hat\{A\}^\{2\}computes a two\-hop random walk\. The structural interference reachinguualong the adversarial pathvvis:
\[A^2X\]u←v=∑t∈𝒩\(v\)A^uvA^vtxt=1dudv∑t∈𝒩\(v\)1dvdtxt\.\[\\hat\{A\}^\{2\}X\]\_\{u\\leftarrow v\}=\\sum\_\{t\\in\\mathcal\{N\}\(v\)\}\\hat\{A\}\_\{uv\}\\hat\{A\}\_\{vt\}x\_\{t\}=\\frac\{1\}\{\\sqrt\{d\_\{u\}d\_\{v\}\}\}\\sum\_\{t\\in\\mathcal\{N\}\(v\)\}\\frac\{1\}\{\\sqrt\{d\_\{v\}d\_\{t\}\}\}x\_\{t\}\.Assume the adversarial nodevvis embedded in a locally homophilic neighborhood such that its neighbors’ features areϵ\\epsilon\-consistent withxvx\_\{v\}\. Formally, letxt=xv\+ξtx\_\{t\}=x\_\{v\}\+\\xi\_\{t\}, where‖ξt‖≤ϵ\\\|\\xi\_\{t\}\\\|\\leq\\epsilonfor allt∈𝒩\(v\)t\\in\\mathcal\{N\}\(v\)\. Assuming bounded degree variation within the local neighborhood \(i\.e\.,dt∈\[c1dv,c2dv\]d\_\{t\}\\in\[c\_\{1\}d\_\{v\},c\_\{2\}d\_\{v\}\]for constantsc1,c2\>0c\_\{1\},c\_\{2\}\>0\), the adversarial contribution becomes:
\[A^2X\]u←v≈1dudv∑t∈𝒩\(v\)1dv\(xv\+ξt\)=\|𝒩\(v\)\|dvdudvxv\+Err,\[\\hat\{A\}^\{2\}X\]\_\{u\\leftarrow v\}\\approx\\frac\{1\}\{\\sqrt\{d\_\{u\}d\_\{v\}\}\}\\sum\_\{t\\in\\mathcal\{N\}\(v\)\}\\frac\{1\}\{d\_\{v\}\}\(x\_\{v\}\+\\xi\_\{t\}\)=\\frac\{\|\\mathcal\{N\}\(v\)\|\}\{d\_\{v\}\\sqrt\{d\_\{u\}d\_\{v\}\}\}x\_\{v\}\+Err,where the error termErr=1dvdudv∑t∈𝒩\(v\)ξtErr=\\frac\{1\}\{d\_\{v\}\\sqrt\{d\_\{u\}d\_\{v\}\}\}\\sum\_\{t\\in\\mathcal\{N\}\(v\)\}\\xi\_\{t\}\. Since\|𝒩\(v\)\|=dv\|\\mathcal\{N\}\(v\)\|=d\_\{v\}, the primary feature injection term evaluates exactly to1dudvxv\\frac\{1\}\{\\sqrt\{d\_\{u\}d\_\{v\}\}\}x\_\{v\}, which is identical to the first\-order pollutionΔGCN\\Delta\_\{GCN\}\. Because the filter appliesI−A^2I\-\\hat\{A\}^\{2\}, this adversarial signal is explicitly subtracted from the node’s representation\. Furthermore, by the triangle inequality, the residual error is bounded by:
‖Err‖≤1dvdudv∑t∈𝒩\(v\)‖ξt‖≤ϵdudv\.\\\|Err\\\|\\leq\\frac\{1\}\{d\_\{v\}\\sqrt\{d\_\{u\}d\_\{v\}\}\}\\sum\_\{t\\in\\mathcal\{N\}\(v\)\}\\\|\\xi\_\{t\}\\\|\\leq\\frac\{\\epsilon\}\{\\sqrt\{d\_\{u\}d\_\{v\}\}\}\.Thus, as local homophily around the attacker increases \(ϵ→0\\epsilon\\to 0\), the quadratic filter effectively cancels the adversarial injection\.□\\square□\\square
### C\.2Stability Analysis
This section establishes the structural robustness of DCQ\-GNN by characterizing its response under Laplacian perturbations\.
###### Theorem 5\(Spectral Stability Under Laplacian Perturbation\)
Let𝐋\\mathbf\{L\}be the normalized graph Laplacian of a graphGG, and let𝐋~=𝐋\+Δ\\tilde\{\\mathbf\{L\}\}=\\mathbf\{L\}\+\\Deltadenote a perturbed Laplacian where‖Δ‖2≤ϵ\\\|\\Delta\\\|\_\{2\}\\leq\\epsilon\. Consider a quadratic spectral filter
g\(𝐋\)=a0I\+a1𝐋\+a2𝐋2\.g\(\\mathbf\{L\}\)=a\_\{0\}I\+a\_\{1\}\\mathbf\{L\}\+a\_\{2\}\{\\mathbf\{L\}\}^\{2\}\.Then the perturbation in the filtered signal satisfies
‖g\(𝐋~\)x−g\(𝐋\)x‖2≤ϵ\(\|a1\|\+4\|a2\|\+\|a2\|ϵ\)‖x‖2\.\\\|g\(\\tilde\{\\mathbf\{L\}\}\)x\-g\(\\mathbf\{L\}\)x\\\|\_\{2\}\\leq\\epsilon\\left\(\|a\_\{1\}\|\+4\|a\_\{2\}\|\+\|a\_\{2\}\|\\epsilon\\right\)\\\|x\\\|\_\{2\}\.Consequently, the quadratic spectral filtering operator exhibits Lipschitz stability with respect to structural perturbations of the graph topology\.
Proof\.Let
g\(𝐋\)=a0I\+a1𝐋\+a2𝐋2\.g\(\\mathbf\{L\}\)=a\_\{0\}I\+a\_\{1\}\\mathbf\{L\}\+a\_\{2\}\{\\mathbf\{L\}\}^\{2\}\.Evaluating the filter on the perturbed operator yields:
g\(𝐋~\)=a0I\+a1\(𝐋\+Δ\)\+a2\(𝐋\+Δ\)2\.g\(\\tilde\{\\mathbf\{L\}\}\)=a\_\{0\}I\+a\_\{1\}\(\\mathbf\{L\}\+\\Delta\)\+a\_\{2\}\(\\mathbf\{L\}\+\\Delta\)^\{2\}\.Expanding the second\-order term, we have
\(𝐋\+Δ\)2=𝐋2\+𝐋Δ\+Δ𝐋\+Δ2\.\(\\mathbf\{L\}\+\\Delta\)^\{2\}=\{\\mathbf\{L\}\}^\{2\}\+\\mathbf\{L\}\\Delta\+\\Delta\\mathbf\{L\}\+\\Delta^\{2\}\.The operator difference is consequently expressed as:
g\(𝐋~\)−g\(𝐋\)=a1Δ\+a2\(𝐋Δ\+Δ𝐋\)\+a2Δ2\.g\(\\tilde\{\\mathbf\{L\}\}\)\-g\(\\mathbf\{L\}\)=a\_\{1\}\\Delta\+a\_\{2\}\(\\mathbf\{L\}\\Delta\+\\Delta\{\\mathbf\{L\}\}\)\+a\_\{2\}\\Delta^\{2\}\.Taking the spectral norm and using the triangle inequality, we have
‖g\(𝐋~\)−g\(𝐋\)‖2≤\|a1\|‖Δ‖2\+\|a2\|\(‖𝐋Δ‖2\+‖Δ𝐋‖2\)\+\|a2\|‖Δ2‖2\.\\\|g\(\\tilde\{\\mathbf\{L\}\}\)\-g\(\\mathbf\{L\}\)\\\|\_\{2\}\\leq\|a\_\{1\}\|\\\|\\Delta\\\|\_\{2\}\+\|a\_\{2\}\|\(\\\|\\mathbf\{L\}\\Delta\\\|\_\{2\}\+\\\|\\Delta\\mathbf\{L\}\\\|\_\{2\}\)\+\|a\_\{2\}\|\\\|\\Delta^\{2\}\\\|\_\{2\}\.By the submultiplicativity of the operator norm and noting that‖𝐋‖2≤2\\\|\\mathbf\{L\}\\\|\_\{2\}\\leq 2for normalized Laplacians, the individual terms are bounded as‖𝐋Δ‖2≤2ϵ\\\|\\mathbf\{L\}\\Delta\\\|\_\{2\}\\leq 2\\epsilonand‖Δ2‖2≤ϵ2\\\|\\Delta^\{2\}\\\|\_\{2\}\\leq\\epsilon^\{2\}\. Substituting these into the inequality yields:
‖g\(𝐋~\)−g\(𝐋\)‖2≤ϵ\(\|a1\|\+2\|a2\|‖𝐋‖2\+\|a2\|ϵ\)\.\\\|g\(\\tilde\{\\mathbf\{L\}\}\)\-g\(\\mathbf\{L\}\)\\\|\_\{2\}\\leq\\epsilon\\left\(\|a\_\{1\}\|\+2\|a\_\{2\}\|\\\|\\mathbf\{L\}\\\|\_\{2\}\+\|a\_\{2\}\|\\epsilon\\right\)\.Finally, for any signalx∈ℝnx\\in\\mathbb\{R\}^\{n\}, the perturbation error satisfies:
‖g\(𝐋~\)x−g\(𝐋\)x‖2\\displaystyle\\\|g\(\\tilde\{\\mathbf\{L\}\}\)x\-g\(\\mathbf\{L\}\)x\\\|\_\{2\}≤‖g\(𝐋~\)−g\(𝐋\)‖2‖x‖2\\displaystyle\\leq\\\|g\(\\tilde\{\\mathbf\{L\}\}\)\-g\(\\mathbf\{L\}\)\\\|\_\{2\}\\\|x\\\|\_\{2\}≤ϵ\(\|a1\|\+2\|a2\|‖𝐋‖2\+\|a2\|ϵ\)‖x‖2\\displaystyle\\leq\\epsilon\\left\(\|a\_\{1\}\|\+2\|a\_\{2\}\|\\\|\\mathbf\{L\}\\\|\_\{2\}\+\|a\_\{2\}\|\\epsilon\\right\)\\\|x\\\|\_\{2\}≤ϵ\(\|a1\|\+4\|a2\|\+\|a2\|ϵ\)‖x‖2\.\\displaystyle\\leq\\epsilon\\left\(\|a\_\{1\}\|\+4\|a\_\{2\}\|\+\|a\_\{2\}\|\\epsilon\\right\)\\\|x\\\|\_\{2\}\.This completes the proof, confirming that the quadratic filter exhibits bounded output variation under bounded topological perturbations\.□\\square\.□\\square
### C\.3Spectral Selectivity of Quadratic Filters
###### Theorem 6\(Enhanced Spectral Resolution of Quadratic Formulations\)
Let𝐋\\mathbf\{L\}be the normalized graph Laplacian with eigenvaluesλ∈\[0,2\]\\lambda\\in\[0,2\]\. Define a linear filter response asgℓ\(λ\)=a0\+a1λg\_\{\\ell\}\(\\lambda\)=a\_\{0\}\+a\_\{1\}\\lambdaand a quadratic counterpart asgq\(λ\)=a0\+a1λ\+a2λ2g\_\{q\}\(\\lambda\)=a\_\{0\}\+a\_\{1\}\\lambda\+a\_\{2\}\\lambda^\{2\}\. For any non\-zeroa2a\_\{2\}, there exists a non\-empty spectral interval\[λ1,λ2\]⊆\[0,2\]\[\\lambda\_\{1\},\\lambda\_\{2\}\]\\subseteq\[0,2\]such that:
\|gq\(λ1\)−gq\(λ2\)\|\>\|gℓ\(λ1\)−gℓ\(λ2\)\|\.\|g\_\{q\}\(\\lambda\_\{1\}\)\-g\_\{q\}\(\\lambda\_\{2\}\)\|\>\|g\_\{\\ell\}\(\\lambda\_\{1\}\)\-g\_\{\\ell\}\(\\lambda\_\{2\}\)\|\.Thus, quadratic filters can achieve strictly higher spectral selectivity under compatible coefficient polarity conditions, enabling sharper discrimination of high\-frequency components relative to the linear regime\.
This result shows that the second\-order term can enlarge spectral separation on suitable intervals without necessarily amplifying perturbation\-sensitive components\.Proof\.Consider the spectral gap between two frequenciesλ1,λ2\\lambda\_\{1\},\\lambda\_\{2\}\. For the linear filter, the gap isΔgℓ=a1\(λ1−λ2\)\\Delta g\_\{\\ell\}=a\_\{1\}\(\\lambda\_\{1\}\-\\lambda\_\{2\}\)\. For the quadratic case, the gap expands to:
Δgq=a1\(λ1−λ2\)\+a2\(λ12−λ22\)=\(λ1−λ2\)\[a1\+a2\(λ1\+λ2\)\]\.\\Delta g\_\{q\}=a\_\{1\}\(\\lambda\_\{1\}\-\\lambda\_\{2\}\)\+a\_\{2\}\(\\lambda\_\{1\}^\{2\}\-\\lambda\_\{2\}^\{2\}\)=\(\\lambda\_\{1\}\-\\lambda\_\{2\}\)\\left\[a\_\{1\}\+a\_\{2\}\(\\lambda\_\{1\}\+\\lambda\_\{2\}\)\\right\]\.For any pair of non\-trivial frequencies such thatsgn\(a1\)=sgn\(a2\(λ1\+λ2\)\)\\text\{sgn\}\(a\_\{1\}\)=\\text\{sgn\}\(a\_\{2\}\(\\lambda\_\{1\}\+\\lambda\_\{2\}\)\), the absolute difference satisfies:
\|gq\(λ1\)−gq\(λ2\)\|=\|\(λ1−λ2\)\|⋅\|a1\+a2\(λ1\+λ2\)\|\>\|a1\(λ1−λ2\)\|\.\|g\_\{q\}\(\\lambda\_\{1\}\)\-g\_\{q\}\(\\lambda\_\{2\}\)\|=\|\(\\lambda\_\{1\}\-\\lambda\_\{2\}\)\|\\cdot\|a\_\{1\}\+a\_\{2\}\(\\lambda\_\{1\}\+\\lambda\_\{2\}\)\|\>\|a\_\{1\}\(\\lambda\_\{1\}\-\\lambda\_\{2\}\)\|\.This amplification of the spectral distance confirms that the quadratic terma2λ2a\_\{2\}\\lambda^\{2\}increases the filter’s sensitivity to frequency variations, particularly in the high\-frequency tail \(λ≈2\\lambda\\approx 2\) wherea2\(λ1\+λ2\)a\_\{2\}\(\\lambda\_\{1\}\+\\lambda\_\{2\}\)is maximized\. This completes the proof\.□\\square□\\square
### C\.4Curvature\-Based Spectral Expressivity
This section investigates the expressive capacity of convex–concave combinations relative to uniform\-polarity banks\. Based on the curvature polarity \(𝒞𝒫\\mathcal\{CP\}\) introduced in Definition[4](https://arxiv.org/html/2606.24956#Thmtheorem4), we demonstrate that opposing polarities are a necessary condition for spectral flexibility within low\-order constraints\. Consider two convex filtersg1\(λ\)g\_\{1\}\(\\lambda\)andg2\(λ\)g\_\{2\}\(\\lambda\)with𝒞𝒫=1\\mathcal\{CP\}=1, combined asH\(λ\)=w1g1\(λ\)\+w2g2\(λ\)H\(\\lambda\)=w\_\{1\}g\_\{1\}\(\\lambda\)\+w\_\{2\}g\_\{2\}\(\\lambda\), wherew1,w2∈\[0,1\]w\_\{1\},w\_\{2\}\\in\[0,1\]\. Since the second derivative of a conical combination of convex functions remains non\-negative \(H′′\(λ\)≥0H^\{\\prime\\prime\}\(\\lambda\)\\geq 0\), the resulting filter is strictly constrained to the convex functional space\. Consequently, its response is restricted to monotonic or U\-shaped profiles, precluding the modeling of localized mid\-frequency structures common in heterophilic graphs\. In contrast, the proposed convex–concave combinationH\(λ\)=w1glowcvx\(λ\)\+w2ghighccv\(λ\)H\(\\lambda\)=w\_\{1\}g\_\{\\text\{low\}\}^\{\\text\{cvx\}\}\(\\lambda\)\+w\_\{2\}g\_\{\\text\{high\}\}^\{\\text\{ccv\}\}\(\\lambda\)yields a second derivative:
H′′\(λ\)=2\(w1−w2\)βλ~2\.H^\{\\prime\\prime\}\(\\lambda\)=\\frac\{2\(w\_\{1\}\-w\_\{2\}\)\\beta\}\{\{\\tilde\{\\lambda\}\}^\{2\}\}\.Analytic derivation reveals that the curvature polarity ofH\(λ\)H\(\\lambda\)is continuously steerable across the set\{−1,0,1\}\\\{\-1,0,1\\\}by modulating the gating weightsw1,w2w\_\{1\},w\_\{2\}\. Unlike uniform\-polarity filters, this configuration facilitates a band\-pass response when the first and second derivatives vanish at critical spectral points—a behavior mathematically unattainable via convex combinations alone\. Consequently, the convex–concave pair serves as a flexible generator for second\-order spectral responses, capturing the full spectrum of homophilic and heterophilic signals within a compact polynomial family\.
Figure 6:Spectral curvature analysis under different polarity coefficients\. The sign of the quadratic coefficient \(a2a\_\{2\}\) determines the curvature of the spectral response\. Pairing convex and concave filters enables complementary low\- and high\-frequency modeling within a single layer\.The following result formally characterizes this spectral bias through the lens of second\-order derivatives \(see Fig\.[6](https://arxiv.org/html/2606.24956#A3.F6)\)\.
###### Theorem 7\(Spectral Curvature Theorem\)
Let𝐋\{\\mathbf\{L\}\}be the normalized graph Laplacian with eigenvaluesλ∈\[0,2\]\\lambda\\in\[0,2\]\. Consider a quadratic spectral filterg\(λ\)=a0\+a1λ\+a2λ2g\(\\lambda\)=a\_\{0\}\+a\_\{1\}\\lambda\+a\_\{2\}\\lambda^\{2\}with curvatureg′′\(λ\)=2a2g^\{\\prime\\prime\}\(\\lambda\)=2a\_\{2\}\. The spectral bias is determined as follows:
- •Ifa2\>0a\_\{2\}\>0\(convexity\), the filter exhibits high\-frequency amplification relative to the low\-frequency regime\.
- •Ifa2<0a\_\{2\}<0\(concavity\), the filter prioritizes low\-frequency components by attenuating the high\-frequency gain\.
Consequently, a convex–concave pair constitutes a complementary spectral decomposition for the simultaneous capture of multi\-scale graph signals\.
Here,g\(λ\)g\(\\lambda\)denotes the final spectral response defined in Eqs\. \([1](https://arxiv.org/html/2606.24956#S2.E1)\)–\([4](https://arxiv.org/html/2606.24956#S2.E4)\)\. The curvature polarity is positive for convex filters and negative for concave filters, which determines how aggressively the filter attenuates frequencies across the spectrum\. In our framework, the coefficientsaia\_\{i\}\(i∈\{0,1,2\}i\\in\\\{0,1,2\\\}\) are parameterized by the learnable cutoffα\\alphaand scaleβ\\beta, allowing the spectral selectivity to remain adaptive\. Theorem[7](https://arxiv.org/html/2606.24956#Thmtheorem7)further establishes that the curvature coefficient governs the spectral bias: convex filters tend to emphasize high\-frequency components, whereas concave filters favor low\-frequency components, enabling complementary frequency decomposition\.Proof\.Letλ1<λ2\\lambda\_\{1\}<\\lambda\_\{2\}be two arbitrary eigenvalues\. The relative amplification is given byΔg=g\(λ2\)−g\(λ1\)\\Delta g=g\(\\lambda\_\{2\}\)\-g\(\\lambda\_\{1\}\)\. Substituting the quadratic form:
Δg=a1\(λ2−λ1\)\+a2\(λ22−λ12\)=\(λ2−λ1\)\[a1\+a2\(λ1\+λ2\)\]\.\\Delta g=a\_\{1\}\(\\lambda\_\{2\}\-\\lambda\_\{1\}\)\+a\_\{2\}\(\\lambda\_\{2\}^\{2\}\-\\lambda\_\{1\}^\{2\}\)=\(\\lambda\_\{2\}\-\\lambda\_\{1\}\)\\left\[a\_\{1\}\+a\_\{2\}\(\\lambda\_\{1\}\+\\lambda\_\{2\}\)\\right\]\.Sinceλ1\+λ2\>0\\lambda\_\{1\}\+\\lambda\_\{2\}\>0for all non\-trivial eigenvalues, the sign of the second\-order coefficienta2a\_\{2\}governs the monotonicity of the gain relative to the frequency magnitude\. Specifically,a2\>0a\_\{2\}\>0induces a positive curvature that scales the gain withλ\\lambda, whereasa2<0a\_\{2\}<0imposes a concave penalty on higher frequencies\. This proves that curvature polarity is the mechanical driver of spectral bias\.□\\square□\\square
In fact, any second\-order spectral filter can be decomposed into the combination of convex and concave filters, which is presented in the following theorem\.
###### Theorem 8\(Convex–Concave Decomposition of Quadratic Filters\)
Let𝒬2=\{q\(λ\)=a0\+a1λ\+a2λ2:a0,a1,a2∈ℝ\}\\mathcal\{Q\}\_\{2\}=\\\{q\(\\lambda\)=a\_\{0\}\+a\_\{1\}\\lambda\+a\_\{2\}\\lambda^\{2\}:a\_\{0\},a\_\{1\},a\_\{2\}\\in\\mathbb\{R\}\\\}denote the space of second\-order polynomials on the interval\[0,2\]\[0,2\]Define the polar sub\-spaces:
𝒬cvx=\{q∈𝒬2:a2\>0\},𝒬ccv=\{q∈𝒬2:a2<0\}\.\\mathcal\{Q\}\_\{cvx\}=\\\{q\\in\\mathcal\{Q\}\_\{2\}:a\_\{2\}\>0\\\},\\qquad\\mathcal\{Q\}\_\{ccv\}=\\\{q\\in\\mathcal\{Q\}\_\{2\}:a\_\{2\}<0\\\}\.Every quadratic filterq∈𝒬2q\\in\\mathcal\{Q\}\_\{2\}admits a decompositionq=qcvx\+qccvq=q\_\{cvx\}\+q\_\{ccv\}, whereqcvx∈𝒬cvxq\_\{cvx\}\\in\\mathcal\{Q\}\_\{cvx\}andqccv∈𝒬ccvq\_\{ccv\}\\in\\mathcal\{Q\}\_\{ccv\}\. This establishes that the space of quadratic filters𝒬2\\mathcal\{Q\}\_\{2\}is the sum of the convex and concave quadratic cones,𝒬2=𝒬cvx\+𝒬ccv\\mathcal\{Q\}\_\{2\}=\\mathcal\{Q\}\_\{cvx\}\+\\mathcal\{Q\}\_\{ccv\}\.
This result establishes that the proposed convex–concave architecture is not merely an intuitive heuristic, but a two\-branch convex–concave decomposition for second\-order spectral filters\.Proof\.For anyq\(λ\)=a0\+a1λ\+a2λ2q\(\\lambda\)=a\_\{0\}\+a\_\{1\}\\lambda\+a\_\{2\}\\lambda^\{2\}, select anyδ\>\|a2\|\\delta\>\|a\_\{2\}\|and define:qcvx\(λ\)=δλ2q\_\{cvx\}\(\\lambda\)=\\delta\\lambda^\{2\}andqccv\(λ\)=a0\+a1λ\+\(a2−δ\)λ2q\_\{ccv\}\(\\lambda\)=a\_\{0\}\+a\_\{1\}\\lambda\+\(a\_\{2\}\-\\delta\)\\lambda^\{2\}\. By construction,δ\>0\\delta\>0ensuresqcvx∈𝒬cvxq\_\{cvx\}\\in\\mathcal\{Q\}\_\{cvx\}, whilea2−δ<0a\_\{2\}\-\\delta<0guaranteesqccv∈𝒬ccvq\_\{ccv\}\\in\\mathcal\{Q\}\_\{ccv\}\. The identityq=qcvx\+qccvq=q\_\{cvx\}\+q\_\{ccv\}holds trivially\.□\\square□\\square
##### Theoretical Implication\.
The presented analysis establishes three properties of DCQ\-GNN:
- •quadratic propagation increases spectral selectivity relative to linear filters;
- •curvature polarity induces complementary frequency responses;
- •the operator remains stable under bounded Laplacian perturbations\.
Together, these results explain why DCQ\-GNN can simultaneously capture low\- and high\-frequency graph signals while maintaining structural robustness\.
## Appendix DAlgorithmic Details
Algorithm 1Overall pipeline of the DCQ\-GNN framework\.0:Graph Laplacian
𝐋∈ℝN×N\\mathbf\{L\}\\in\\mathbb\{R\}^\{N\\times N\}, Node features
X∈ℝN×FX\\in\\mathbb\{R\}^\{N\\times F\}
0:Output embedding
Hout∈ℝN×F′H\_\{\\text\{out\}\}\\in\\mathbb\{R\}^\{N\\times F^\{\\prime\}\}
1:Step 1: Second\-Order Recursive Propagation
2:
𝐙0←X\\mathbf\{Z\}\_\{0\}\\leftarrow X
3:
𝐙1←𝐋𝐙0\\mathbf\{Z\}\_\{1\}\\leftarrow\\mathbf\{L\}\\mathbf\{Z\}\_\{0\}\{One\-hop propagation\}
4:
𝐙2←𝐋𝐙1\\mathbf\{Z\}\_\{2\}\\leftarrow\\mathbf\{L\}\\mathbf\{Z\}\_\{1\}\{Two\-hop propagation\}
5:Step 2: Construction of Quadratic Filter Bank
6:
λ~←2−α\\tilde\{\\lambda\}\\leftarrow 2\-\\alpha\{Cutoff proxy\}
7:
c←β/λ~2c\\leftarrow\\beta/\\tilde\{\\lambda\}^\{2\}\{Magnitude normalization\}
8:iflow\-pass curvature is convexthen
9:
Hlow←c\(𝐙2−2λ~𝐙1\+λ~2𝐙0\)H\_\{\\text\{low\}\}\\leftarrow c\\big\(\\mathbf\{Z\}\_\{2\}\-2\\tilde\{\\lambda\}\\mathbf\{Z\}\_\{1\}\+\\tilde\{\\lambda\}^\{2\}\\mathbf\{Z\}\_\{0\}\\big\)
10:else
11:
Hlow←−c\(𝐙2−λ~2𝐙0\)H\_\{\\text\{low\}\}\\leftarrow\-c\\big\(\\mathbf\{Z\}\_\{2\}\-\\tilde\{\\lambda\}^\{2\}\\mathbf\{Z\}\_\{0\}\\big\)
12:endif
13:
Hmid←−4c\(𝐙2−λ~𝐙1\)H\_\{\\text\{mid\}\}\\leftarrow\-4c\\big\(\\mathbf\{Z\}\_\{2\}\-\\tilde\{\\lambda\}\\mathbf\{Z\}\_\{1\}\\big\)
14:ifhigh\-pass curvature is convexthen
15:
Hhigh←c𝐙2H\_\{\\text\{high\}\}\\leftarrow c\\,\\mathbf\{Z\}\_\{2\}
16:else
17:
Hhigh←−c\(𝐙2−2λ~𝐙1\)H\_\{\\text\{high\}\}\\leftarrow\-c\\big\(\\mathbf\{Z\}\_\{2\}\-2\\tilde\{\\lambda\}\\mathbf\{Z\}\_\{1\}\\big\)
18:endif
19:
Hall←𝐙0H\_\{\\text\{all\}\}\\leftarrow\\mathbf\{Z\}\_\{0\}\{All\-pass residual anchor\}
20:Step 3: Channel\-wise Linear Projection
21:for
k∈\{low,mid,high,all\}k\\in\\\{\\text\{low\},\\text\{mid\},\\text\{high\},\\text\{all\}\\\}do
22:
H~k←Hk𝚯k\\tilde\{H\}\_\{k\}\\leftarrow H\_\{k\}\\mathbf\{\\Theta\}\_\{k\}
23:endfor
24:Step 4: Node\-Adaptive Gated Fusion
25:
Hconcat←\[Hlow‖Hmid‖Hhigh∥Hall\]H\_\{\\text\{concat\}\}\\leftarrow\[H\_\{\\text\{low\}\}\\\|H\_\{\\text\{mid\}\}\\\|H\_\{\\text\{high\}\}\\\|H\_\{\\text\{all\}\}\]
26:
S←softmax\(HconcatWgate\+𝐛gate\)S\\leftarrow\\mathrm\{softmax\}\(H\_\{\\text\{concat\}\}W\_\{\\text\{gate\}\}\+\\mathbf\{b\}\_\{\\text\{gate\}\}\)
27:
Hout←∑kSk⊙H~kH\_\{\\text\{out\}\}\\leftarrow\\sum\_\{k\}S\_\{k\}\\odot\\tilde\{H\}\_\{k\}
28:
Hout←σ\(Hout\)H\_\{\\text\{out\}\}\\leftarrow\\sigma\(H\_\{\\text\{out\}\}\)
29:return
HoutH\_\{\\text\{out\}\}
The overall pipeline of DCQ\-GNN is summarized in Algorithm[1](https://arxiv.org/html/2606.24956#alg1)\. The quadratic filter responses are computed via second\-order recursive Laplacian propagation, thereby avoiding explicit polynomial expansion or eigendecomposition\. Each propagation step consists of a sparse–dense matrix multiplication\.
## Appendix EDetailed Experiments
This appendix provides supplementary experimental details, including dataset statistics \(Table[7](https://arxiv.org/html/2606.24956#A5.T7)\), baseline descriptions, and hyperparameter configurations\. In addition, we extend the empirical study to further analyze robustness under adversarial perturbations, spectral adaptivity across structural regimes, and performance under class\-balanced training protocols\. Beyond the primary results reported in the main text, we address the following additional research questions:
- •RQ2 Extension \(Robustness Analysis\):Does the proposed structural interference correction mechanism remain effective under adversarial topology attacks on strongly homophilic graphs \(e\.g\.,PubMed\)?
- •RQ5 \(Spectral Selectivity and Adaptivity\):How do the learned spectral responses of the DCQ\-GNN filter bank adapt across homophilic and heterophilic graph regimes compared to standard polynomial filters?
- •RQ6 \(Evaluation under Dense Splits\):How does DCQ\-GNN perform relative to recent spectral baselines when evaluated under a class\-balanced \(dense split\) training protocol?
### E\.1Datasets
Table 7:Summary of dataset statistics and structural properties\. Datasets are grouped into homophilic and heterophilic regimes\. Reported metrics include the number of nodes\|𝒱\|\|\\mathcal\{V\}\|, edges\|ℰ\|\|\\mathcal\{E\}\|, feature dimensionFF, number of classes, assortativity, and three homophily measures \(hnodeh\_\{\\text\{node\}\},hedgeh\_\{\\text\{edge\}\},hclassh\_\{\\text\{class\}\}\)\.We evaluate the proposed model on1010benchmark datasets spanning a broad range of structural properties\. Consistent with the main text, datasets are categorized according to homophily level: \(i\) homophilic graphs, where edges predominantly connect nodes of the same class, and \(ii\) heterophilic graphs, where edges frequently connect nodes of different classes\.
##### Homophilic Datasets\.
- •CiteSeer & PubMed\[[32](https://arxiv.org/html/2606.24956#bib.bib8)\]: Standard citation networks evaluated using the Planetoid experimental protocol\. Nodes represent scientific documents linked by citation edges, with features represented as bag\-of\-words vectors\. The task entails multi\-class node classification across distinct research topics\.
- •Amazon Computers & Photo\[[26](https://arxiv.org/html/2606.24956#bib.bib6)\]: Subgraphs extracted from the Amazon co\-purchase network\. Nodes correspond to products, while edges denote frequent co\-purchase relations\. Node features are encoded from review text to facilitate product category classification\.
- •WikiCS\[[19](https://arxiv.org/html/2606.24956#bib.bib5)\]: A semi\-supervised benchmark comprising Wikipedia articles related to computer science\. Nodes represent articles connected by hyperlinks\. The dataset is characterized by multiple predefined training splits to ensure evaluation stability in node classification\.
##### Heterophilic Datasets\.
- •WebKB \(Texas, Wisconsin\)\[[21](https://arxiv.org/html/2606.24956#bib.bib28)\]: Hyperlink networks of university computer science departments\. Nodes represent webpages and edges denote hyperlinks\. Features are bag\-of\-words representations\. The task is page classification into categories such as faculty, student, and course\.
- •WikipediaNetwork \(Chameleon, Squirrel\)\[[24](https://arxiv.org/html/2606.24956#bib.bib7)\]: Page\-to\-page hyperlink graphs from Wikipedia\. Node features are derived from textual content\. The target variable is discretized page traffic\.
- •Actor\[[21](https://arxiv.org/html/2606.24956#bib.bib28)\]: The actor\-induced subgraph of a film collaboration network\. Nodes represent actors and edges capture co\-occurrence relationships\. The task is categorical node classification\.
These datasets collectively enable evaluation under varying spectral regimes, including low\-frequency\-dominant \(homophilic\) and high\-frequency\-dominant \(heterophilic\) settings\.
### E\.2Baselines
We compare DCQ\-GNN against a diverse set of spatial and spectral graph neural networks:
- •GCN\[[15](https://arxiv.org/html/2606.24956#bib.bib11)\]: A first\-order low\-pass spectral approximation with propagation ruleg\(λ\)=1−λg\(\\lambda\)=1\-\\lambda\.
- •GCNII\[[8](https://arxiv.org/html/2606.24956#bib.bib17)\]: Incorporates initial residual connections and identity mapping to alleviate over\-smoothing in deep architectures\.
- •GAT\[[28](https://arxiv.org/html/2606.24956#bib.bib15)\]: Introduces attention\-based anisotropic aggregation\.
- •JK\-Net\[[31](https://arxiv.org/html/2606.24956#bib.bib16)\]: Aggregates multi\-scale representations through jump connections\.
- •H2GCN\[[35](https://arxiv.org/html/2606.24956#bib.bib25)\]: Explicitly models higher\-order neighborhoods and ego\-feature separation, designed for heterophilic settings\.
- •EvenNet\[[16](https://arxiv.org/html/2606.24956#bib.bib20)\]: Employs even\-degree polynomial filters to achieve spectral symmetry\.
- •FAGCN\[[4](https://arxiv.org/html/2606.24956#bib.bib23)\]: Integrates low\- and high\-frequency components through adaptive gating\.
- •JacobiConv\[[29](https://arxiv.org/html/2606.24956#bib.bib22)\]: A high\-order polynomial spectral model using Jacobi basis functions\.
- •BernNet\[[12](https://arxiv.org/html/2606.24956#bib.bib19)\]: Utilizes Bernstein polynomial bases for flexible spectral approximation\.
- •AutoSGNN\[[20](https://arxiv.org/html/2606.24956#bib.bib38)\]: An automated framework combining large language models and evolutionary search for propagation design\.
- •LHK\-GNN\[[22](https://arxiv.org/html/2606.24956#bib.bib39)\]: Introduces localized heat kernel filtering to address over\-smoothing\.
This selection covers first\-order spatial models, attention\-based methods, polynomial spectral filters, and kernel\-based approaches\.
### E\.3Hyperparameter Optimization\.
We employ the Optuna framework for automated hyperparameter tuning, with the primary objective of maximizing validation accuracy\. Each dataset is allocated100100trials subject to a maximum time budget of3,6003,600seconds\. To ensure empirical rigor, all models undergo the same search protocol\. In instances where official implementations provided dataset\-specific tuned parameters, those configurations are adopted to ensure a competitive and fair baseline performance\.
To reduce computational overhead, we employ Optuna’sMedianPrunerwith 10 startup trials, a 50\-epoch warm\-up period, and evaluation intervals of 10 epochs\.
- •General Training Parameters \(Shared across all tuned models\): - –Learning rate:Log\-uniform distribution∈\[1e\-4,1e\-2\]\\in\[1\\text\{e\-\}4,1\\text\{e\-\}2\]\. - –Weight decay:Log\-uniform distribution∈\[1e\-6,1e\-2\]\\in\[1\\text\{e\-\}6,1\\text\{e\-\}2\]\. - –Dropout rate:Uniform distribution∈\[0\.0,0\.9\]\\in\[0\.0,0\.9\]\.
- •DCQ\-GNN Architecture\-Specific Parameters: - –Number of layers:Integer∈\[1,3\]\\in\[1,3\]\. - –Low\-pass filter curvature:Categorical∈\{Convex \(cvx\),Concave \(ccv\)\}\\in\\\{\\text\{Convex \(cvx\)\},\\text\{Concave \(ccv\)\}\\\}, tuned independently for each layer\. - –High\-pass filter curvature:Categorical∈\{Convex \(cvx\),Concave \(ccv\)\}\\in\\\{\\text\{Convex \(cvx\)\},\\text\{Concave \(ccv\)\}\\\}, tuned independently for each layer\.
### E\.4Sensitivity toα\\alphaandβ\\beta
The parameterα\\alphacontrols the cutoff proxyλ~=2−α\\tilde\{\\lambda\}=2\-\\alpha, whileβ\\betarescales the quadratic response subject to\|g\(λ\)\|≤1\|g\(\\lambda\)\|\\leq 1\. Fig\.[7](https://arxiv.org/html/2606.24956#A5.F7)visualizes how these parameters modulate the filter profiles\.
Figure 7:Effect of parameters on spectral response\. \(Top\)α\\alphashifts the transition frequency \(cutoff\), while \(Bottom\)β\\betacontrols the response magnitude\. Convex \(solid\) and concave \(dashed\) filters provide complementary spectral sharpness at the same polynomial order\.#### E\.4\.1Parameter\-Specific Weight Decay\.
To avoid unintended regularization of spectral shape parameters, weight decay is applied only to standard network weights\. Learnable spectral parameters \(cutoff proxiesα\\alphaand magnitude scalarsβ\\beta\), bias terms, and normalization parameters are excluded from weight decay \(decay rate set to 0\.0\)\. This design ensures that curvature parameters are not artificially suppressed during optimization\.
Figure 8:Training, validation, and test cross\-entropy loss trajectories under two cutoff initializations,α=0\\alpha=0\(λ~=2\)\(\\tilde\{\\lambda\}=2\)andα=1\\alpha=1\(λ~=1\)\(\\tilde\{\\lambda\}=1\), on Actor and WikiCS datasets\.##### Convergence Stability Analysis\.
To examine optimization stability, we track training\-loss trajectories under two initialization settings,α=0\\alpha=0\(λ~=2\)\(\\tilde\{\\lambda\}=2\)andα=1\\alpha=1\(λ~=1\)\(\\tilde\{\\lambda\}=1\), as shown in Figure[8](https://arxiv.org/html/2606.24956#A5.F8)\. On WikiCS, the two trajectories are nearly overlapping, decrease smoothly, and stabilize after approximately 400–500 epochs, indicating that optimization is insensitive to the initial cutoff proxy on this homophilic graph\. On Actor, the two settings also follow similar trajectories, with the cross\-entropy loss decreasing rapidly in the early stage and increasing after about 50 epochs\. This pattern suggests that validation loss is not always aligned with final classification accuracy on heterophilic graphs, where confidence calibration and decision\-boundary refinement may evolve differently\. We therefore select checkpoints by validation accuracy rather than validation loss, which is consistent with the node\-classification objective used in our evaluation\.
Table 8:Node classification accuracy under different initializations of the filter cutoff parameterα\\alpha\. Relative changes are computed against the original initialization\.
##### Initialization Sensitivity of Accuracy\.
Although the loss curves remain stable under both initializations, Table[8](https://arxiv.org/html/2606.24956#A5.T8)shows that classification accuracy exhibits dataset\-dependent sensitivity\. On WikiCS, changing the initialization fromα=0\\alpha=0toα=1\\alpha=1causes only a0\.77%0\.77\\%relative drop, suggesting that the homophilic smoothing signal is robust to the initial cutoff proxy\. On Actor, the same change leads to a larger decrease of6\.76%6\.76\\%\. Since heterophilic graphs rely more heavily on non\-low\-frequency components, initializing withλ~=2\\tilde\{\\lambda\}=2provides a broader initial spectral range and facilitates the subsequent adaptation of high\-frequency channels\.
### E\.5Robustness under Adversarial Perturbations \(RQ2 Extension\)
Table 9:Node classification accuracy under the DICE structural attack onPubMed\.↓Δ%\\downarrow\\Delta\\%denotes the relative accuracy drop from the clean graph\. The best rounded accuracy at each perturbation ratio is highlighted in bold\.We further evaluate robustness on PubMed under DICE structural perturbations\. PubMed is a homophilic graph \(h=0\.79h=0\.79\), so heterophilic edge perturbations directly disrupt the label\-consistent neighborhood structure\. As shown in Table[9](https://arxiv.org/html/2606.24956#A5.T9), first\-order spatial models degrade rapidly: at the20%20\\%perturbation ratio, GCN and GAT drop by23\.82%23\.82\\%and21\.65%21\.65\\%, respectively\. Polynomial spectral baselines are more stable, with BernNet and JacobiConv decreasing by7\.90%7\.90\\%and7\.67%7\.67\\%\. DCQ\-GNN obtains the smallest degradation at20%20\\%, retaining85\.63%85\.63\\%accuracy with a relative drop of4\.45%4\.45\\%\. At5%5\\%, DCQ\-GNN ties JacobiConv in rounded accuracy \(88\.37%88\.37\\%\), while at10%10\\%and20%20\\%it achieves the best accuracy among all compared methods\. These results support the robustness of curvature\-controlled quadratic filtering under adversarial topology modification in homophilic graphs\.
### E\.6Spectral Visualization \(RQ5\)
Figure 9:Learned layer\-wise spectral responses of DCQ\-GNN compared with second\-order \(K=2K=2\) polynomial baselines on homophilic \(PubMed, WikiCS\) and heterophilic \(Actor, Chameleon\) graphs\. Shaded regions denote the explicit frequency channels learned by DCQ\-GNN\.To further analyze the spectral behavior of the proposed model across varying homophily regimes, we visualize the learned spectral responses of the DCQ\-GNN filter bank alongside established polynomial baselines in Fig\.[9](https://arxiv.org/html/2606.24956#A5.F9)\. To ensure a fair comparison in terms of model complexity, BernNet and JacobiConv are restricted to second\-order polynomial capacity \(K=2K=2\), matching the quadratic formulation of DCQ\-GNN\.
As shown in Fig\.[9](https://arxiv.org/html/2606.24956#A5.F9), DCQ\-GNN exhibits topology\-dependent spectral adaptation\. In homophilic graphs \(PubMed and WikiCS\), the learned response is dominated by the convex low\-pass component, emphasizing low\-frequency smoothing, while the high\-pass channel displays limited activation\. In contrast, for heterophilic graphs \(Actor and Chameleon\), higher\-frequency components become more pronounced, indicating increased reliance on disassortative structural signals\.
The visualization also reveals the behavior of standard polynomial approximations under low\-order constraints\. On the heterophilic Chameleon dataset, JacobiConv converges to a monotonic low\-pass profile under theK=2K=2setting, limiting its ability to emphasize higher\-frequency bands\. Similarly, BernNet learns an approximately flat response on the Actor dataset\. While such a flat profile is consistent with empirical observations that structure\-agnostic MLP\-style models can perform competitively on certain heterophilic graphs, it indicates limited spectral differentiation at low polynomial order\.
In contrast, DCQ\-GNN decomposes the spectrum into complementary frequency channels rather than fitting a single global polynomial\. This channel\-wise decomposition enables the model to combine frequency\-specific responses through node\-adaptive gating, facilitating flexible spectral discrimination without increasing polynomial degree\.
### E\.7Node Classification with Dense Split \(Balanced Classes, RQ6\)
Table 10:Node classification accuracy \(mean±\\pmstandard deviation\) on four benchmark datasets under the dense split setting\. The best result is highlighted inboldand the second best isunderlined\.In the main paper \(Table[2](https://arxiv.org/html/2606.24956#S3.T2)\), models are evaluated under SRS splits without enforcing class balance\. In contrast, Table[10](https://arxiv.org/html/2606.24956#A5.T10)reports results under the dense split protocol\.
To provide a complementary evaluation and enable comparison with recent studies, we additionally evaluate DCQ\-GNN under a class\-balanced “dense split” protocol\. In this setting, the training set allocates an equal number of nodes to each class \(specifically,Ntrain=N×train\_ratio/CN\_\{train\}=N\\times train\\\_ratio/Cnodes per class\)\. Although this protocol modifies the inherent class imbalance, it is widely adopted in contemporary spectral GNN literature\.
For completeness, we include two recent baselines in this evaluation: AutoSGNN and LHK\-GNN\. As reported in Table[10](https://arxiv.org/html/2606.24956#A5.T10), DCQ\-GNN achieves the best overall average rank \(2\.50\) across the evaluated datasets and outperforms AutoSGNN and LHK\-GNN on average\. Under this protocol, DCQ\-GNN achieves competitive performance across all datasets and obtains the best average rank\. Overall, these results indicate that the proposed convex–concave quadratic filter bank maintains competitive performance under both naturally imbalanced and class\-balanced training settings\.
## Appendix FRelated Work
We review prior work most relevant to DCQ\-GNN from three perspectives: spectral graph convolution, adaptive polynomial spectral filtering, and filter\-bank mechanisms for learning on heterophilic graphs\.
### F\.1Spectral Graph Convolutional Networks
Spectral graph neural networks originate from graph signal processing, which extends classical signal filtering to irregular graph domains\[[27](https://arxiv.org/html/2606.24956#bib.bib27)\]\. Early formulations defined convolution in the graph Laplacian eigenbasis\[[6](https://arxiv.org/html/2606.24956#bib.bib32)\], but their reliance on eigendecomposition limited scalability\. ChebyNet\[[10](https://arxiv.org/html/2606.24956#bib.bib1)\]alleviates this issue by approximating spectral filters using Chebyshev polynomials, enabling localized and efficient propagation without explicit eigenvectors\. GCN\[[15](https://arxiv.org/html/2606.24956#bib.bib11)\]further simplifies this formulation to a first\-order polynomial approximation, resulting in an efficient and widely adopted low\-pass operator\.
Many spatial GNN architectures can be interpreted within this spectral framework and exhibit an inherent low\-frequency bias, including GraphSAGE\[[11](https://arxiv.org/html/2606.24956#bib.bib26)\], GAT\[[28](https://arxiv.org/html/2606.24956#bib.bib15)\], JK\-Net\[[31](https://arxiv.org/html/2606.24956#bib.bib16)\], and GCNII\[[8](https://arxiv.org/html/2606.24956#bib.bib17)\]\. While such low\-pass smoothing behavior is well suited to homophilic graphs, it is often suboptimal in heterophilic settings, where adjacent nodes tend to have dissimilar labels and discriminative information may reside in higher\-frequency components\[[3](https://arxiv.org/html/2606.24956#bib.bib13),[35](https://arxiv.org/html/2606.24956#bib.bib25)\]\.
### F\.2Adaptive Polynomial Spectral Filters
To overcome the limitations of fixed low\-pass filtering, several approaches have explored adaptive polynomial spectral responses\. GPR\-GNN\[[9](https://arxiv.org/html/2606.24956#bib.bib18)\]learns propagation coefficients inspired by generalized PageRank\[[5](https://arxiv.org/html/2606.24956#bib.bib35)\], allowing data\-dependent aggregation across multiple hop distances\. BernNet\[[12](https://arxiv.org/html/2606.24956#bib.bib19)\]adopts Bernstein polynomial bases with constrained coefficients to improve numerical stability while supporting diverse spectral profiles\. JacobiConv\[[29](https://arxiv.org/html/2606.24956#bib.bib22)\]employs orthogonal Jacobi polynomials to mitigate instability in higher\-order expansions, and UniFilter\[[14](https://arxiv.org/html/2606.24956#bib.bib9)\]further enhances adaptivity by dynamically constructing spectral bases tailored to graph heterophily\.
Although these methods significantly increase spectral flexibility, achieving sharper frequency selectivity typically requires higher polynomial orders\. This may enlarge the hypothesis space, increase sensitivity to noise, and incur additional computational cost due to repeated propagation steps\[[17](https://arxiv.org/html/2606.24956#bib.bib14)\]\. In contrast, our work investigates whether a strictly second\-order design can capture much of the practical benefit of spectral adaptivity, while offering improved stability and parameter efficiency\.
### F\.3Filter Banks for Heterophilic Graphs
A complementary line of research addresses heterophily by decomposing graph signals into multiple frequency components and combining them through learned mixing mechanisms\. FAGCN\[[4](https://arxiv.org/html/2606.24956#bib.bib23)\]integrates low\- and high\-frequency information via frequency\-adaptive gating, while related multi\-channel formulations have been explored in AutoGCN\[[30](https://arxiv.org/html/2606.24956#bib.bib34)\]and Adaptive Channel Mixing \(ACM\)\[[18](https://arxiv.org/html/2606.24956#bib.bib24)\]\. More recently, Zheng et al\.\[[34](https://arxiv.org/html/2606.24956#bib.bib3)\]introduced spectral tight framelet filter banks composed of one low\-pass and multiple high\-pass filters, enabling principled frequency separation with stable reconstruction guarantees\.
Despite their effectiveness, most existing filter\-bank approaches rely on linear filters as their fundamental building blocks\. From a spectral viewpoint, repeated linear low\-pass propagation exhibits no intrinsic curvature and therefore provides limited capability for suppressing structurally induced interference\. Consequently, perturbations or corrupted components may be repeatedly propagated and even amplified\[[13](https://arxiv.org/html/2606.24956#bib.bib37),[36](https://arxiv.org/html/2606.24956#bib.bib36)\]\. In contrast, DCQ\-GNN employs quadratic filters with an explicit convex–concave decomposition, thereby introducing controllable curvature into the spectral response\. As shown in our theoretical analysis, this curvature enables selective attenuation of interference\-prone frequency components while preserving informative signal bands\. Consequently, DCQ\-GNN realizes a structured filter\-bank architecture that yields sharper spectral discrimination and improved robustness, without relying on high\-order polynomial expansions or sacrificing the computational efficiency of low\-order message passing\.
Several studies have explored multi\-channel filtering to address graph heterophily\. Although Graph Framelets provide powerful multi\-scale frequency decompositions, they typically require complex frame construction and high\-order spectral expansions\. In contrast, DCQ\-GNN provides a lightweight quadratic alternative: by exploiting curvature polarity rather than multi\-level scaling, the proposed model achieves comparable spectral separation with a strictly two\-hop computational footprint\.
### F\.4Connections to Continuous Diffusion and Graph Transformers
##### Continuous Graph Diffusion\.
GRAND\[[7](https://arxiv.org/html/2606.24956#bib.bib40)\]and GraphCON\[[25](https://arxiv.org/html/2606.24956#bib.bib41)\]define node dynamics via neural ODEs on graphs, with stability certified through Lyapunov analysis of the continuous flow\. The second\-order filterg\(λ\)=a0\+a1λ\+a2λ2g\(\\lambda\)=a\_\{0\}\+a\_\{1\}\\lambda\+a\_\{2\}\\lambda^\{2\}admits a discrete\-time interpretation as a two\-step Runge–Kutta approximation of such diffusion:a1a\_\{1\}governs the diffusion rate, whilea2a\_\{2\}introduces a curvature correction analogous to the momentum coupling in GraphCON\. Theorem[5](https://arxiv.org/html/2606.24956#Thmtheorem5)can thus be viewed as the spectral counterpart of those Lyapunov certificates\. Both characterize bounded output variation under bounded structural perturbation, whereas our result operates on filter coefficients rather than ODE parameters\.
##### Graph Transformers\.
Transformer\-based architectures such as Graphormer\[[33](https://arxiv.org/html/2606.24956#bib.bib42)\]and GPS\[[23](https://arxiv.org/html/2606.24956#bib.bib43)\]attend globally over node pairs, achieving strong performance at the cost of𝒪\(N2\)\\mathcal\{O\}\(N^\{2\}\)attention complexity or explicit structural positional encodings\. DCQ\-GNN targets a complementary regime in which𝒪\(M\)\\mathcal\{O\}\(M\)polynomial propagation is preferred for efficiency and spectral interpretability\. We therefore view Graph Transformers as a distinct design family rather than direct competitors\.Similar Articles
Hierarchical Multi-Scale Graph Neural Networks: Scalable Heterophilous Learning with Oversmoothing and Oversquashing Mitigation
This paper introduces HMH, a hierarchical multi-scale Graph Neural Network framework designed to address oversmoothing and oversquashing in heterophilous graphs. It utilizes spectral filters with Haar bases to achieve scalable learning and improved performance on node and graph classification tasks.
Enhanced Graph Neural Networks using K-Hop Gaussian Diffusion
This paper proposes a K-Hop Gaussian (KHG) diffusion kernel as a preprocessing module for graph neural networks, balancing local and global information propagation to mitigate over-smoothing and information bottlenecks. Experiments show significant improvements over traditional message-passing GNNs and existing diffusion kernels, especially on noisy or structurally complex graphs.
Modeling Heterophily in Multiplex Graphs: An Adaptive Approach for Node Classification
This paper introduces HAAM, a novel method for node classification in multiplex graphs that adapts to both homophilic and heterophilic interactions across dimensions. It uses dimension-specific compatibility matrices and a product of trainable low-pass and high-pass filters approximated via Chebyshev polynomials to capture smooth and abrupt signal changes.
Provably Communication-Efficient and Privacy-Preserving Federated Graph Neural Networks
This paper proposes CE-FedGNN, a federated graph neural network framework that achieves communication efficiency and privacy preservation by infrequently exchanging aggregated node representations with metric differential privacy guarantees, and demonstrates strong performance on benchmarks.
Spectral Gradient Surgery for Domain-Generalizable Dataset Distillation
This paper introduces Domain Generalizable Dataset Distillation (DGDD), a new problem setting that targets out-of-distribution generalization of distilled datasets, and proposes Spectral Gradient Surgery (SGS) to disentangle class-discriminative and domain-specific information by leveraging cross-domain gradient agreement in the spectral domain.