Curvature-Informed Potential Energy Surface for Protein-Ligand Binding Affinity Prediction
Summary
This paper proposes CPES, a curvature-informed potential energy surface graph neural network for protein-ligand binding affinity prediction. It integrates physics-informed curvature representations to model conformational flexibility and achieves improved predictive performance on benchmark datasets.
# Curvature-Informed Potential Energy Surface for Protein-Ligand Binding Affinity Prediction
Source: [https://arxiv.org/html/2606.14217](https://arxiv.org/html/2606.14217)
Chuan-Xian Ren∗, Hong Yan

This work is supported in part by National Key R&D Program of China (2024YFA1011900), National Natural Science Foundation of China (62376291), Guangdong Basic and Applied Basic Research Foundation (2023B1515020004), Science and Technology Program of Guangzhou (2024A04J6413), and in part by the Hong Kong Innovation and Technology Commission (ITC) (InnoHK Project CIMDA) and the Institute of Digital Medicine of City University of Hong Kong (Project 9229503). (Corresponding author: Chuan-Xian Ren.) P.F. Sun and C.X. Ren are with the School of Mathematics, Sun Yat-Sen University, Guangzhou, Guangdong 510275, China (e-mail: rchuanx@mail.sysu.edu.cn). H. Yan is with the Department of Electrical Engineering, City University of Hong Kong, Hong Kong. The code is available at https://github.com/Peng-Fei-Sun/CPES.
###### Abstract
Accurate prediction of protein\-ligand binding affinity is essential for structure\-based drug discovery\. Recent geometric deep learning methods have achieved promising performance by representing protein\-ligand complexes as three\-dimensional graphs\. However, most existing approaches mainly rely on static interaction geometry from a single bound conformation, while neglecting molecular flexibility and binding\-induced conformational changes\. To address this limitation, we propose a curvature\-informed potential energy surface \(CPES\) graph neural network for protein\-ligand binding affinity prediction, which incorporates physics\-informed curvature representations to model conformational flexibility\. CPES first derives curvature spectral descriptors from the Hessian of the potential energy surface evaluated at equilibrium configurations, whose eigenvalues define the local principal curvatures of the potential energy surface\. It then uses spectral cross\-attention to compare the unbound ligand and protein with the bound complex, thereby capturing binding\-induced changes in conformational dynamics\. In parallel, hierarchical protein\-ligand interaction representations are learned from static structural features through geometry\-aware message passing, soft clustering, and bidirectional cross\-attention\. Finally, CPES fuses the curvature\-informed dynamic representations with static interaction representations for affinity regression\. Extensive evaluations on multiple benchmark datasets demonstrate that CPES achieves improved predictive performance and offers physical interpretability\.
###### Index Terms
protein\-ligand binding affinity, graph neural networks, inductive bias, structural dynamics, cross\-attention\.
## 1 Introduction
Figure 1: Overview of the CPES framework. CPES integrates curvature-informed dynamic modeling with geometry-aware structural learning. Eigenmodes characterize conformational flexibility and collective motions (small red arrows indicate molecular motion trends). Cross-attention is applied in both dynamic and static branches to capture binding-related interactions. By combining dynamic and structural information, the framework provides a physically meaningful representation for protein-ligand binding affinity prediction.
Accurate estimation of protein-ligand binding affinity, which reflects the strength of interaction between a biomolecular target and a small-molecule ligand, constitutes a cornerstone of structure-based drug discovery \[[21](https://arxiv.org/html/2606.14217#bib.bib2),[6](https://arxiv.org/html/2606.14217#bib.bib3),[25](https://arxiv.org/html/2606.14217#bib.bib4),[20](https://arxiv.org/html/2606.14217#bib.bib5)\]. This quantity underlies key stages of the discovery pipeline, including compound prioritization \[[12](https://arxiv.org/html/2606.14217#bib.bib6)\], lead refinement \[[28](https://arxiv.org/html/2606.14217#bib.bib7)\], and rational drug design \[[10](https://arxiv.org/html/2606.14217#bib.bib8)\], and is commonly quantified using dissociation or inhibition constants such as $K_d$ and $K_i$. While experimental techniques offer reliable affinity measurements, their high cost, limited throughput, and substantial time requirements hinder large-scale exploration of chemical space. Thus, computational modeling has become an essential component of modern drug discovery, enabling efficient screening and ranking of candidate compounds \[[18](https://arxiv.org/html/2606.14217#bib.bib9)\]. The increasing availability of experimentally resolved protein-ligand complex structures, together with curated affinity annotations, has further stimulated the adoption of machine learning approaches for data-driven affinity prediction \[[22](https://arxiv.org/html/2606.14217#bib.bib10),[26](https://arxiv.org/html/2606.14217#bib.bib11)\]. Despite these advances, accurately capturing protein-ligand binding affinity remains a challenging task, owing to the intricate and context-dependent nature of molecular interactions, which continues to limit the effectiveness of existing learning-based models \[[4](https://arxiv.org/html/2606.14217#bib.bib12)\].
Recent advances in machine learning, particularly geometric deep learning, have substantially improved protein-ligand binding affinity prediction \[[24](https://arxiv.org/html/2606.14217#bib.bib13),[27](https://arxiv.org/html/2606.14217#bib.bib14),[1](https://arxiv.org/html/2606.14217#bib.bib15)\]. Existing methods can be broadly categorized into interaction-free and interaction-based approaches \[[24](https://arxiv.org/html/2606.14217#bib.bib13)\]. Interaction-free methods \[[16](https://arxiv.org/html/2606.14217#bib.bib16),[15](https://arxiv.org/html/2606.14217#bib.bib17),[31](https://arxiv.org/html/2606.14217#bib.bib18)\] represent ligands using fingerprints, SMILES, or 2D graphs and model proteins as sequences, intentionally omitting explicit atomic interactions for simplicity. In contrast, interaction-based models \[[5](https://arxiv.org/html/2606.14217#bib.bib19),[19](https://arxiv.org/html/2606.14217#bib.bib20),[14](https://arxiv.org/html/2606.14217#bib.bib21),[7](https://arxiv.org/html/2606.14217#bib.bib22),[17](https://arxiv.org/html/2606.14217#bib.bib23),[32](https://arxiv.org/html/2606.14217#bib.bib24),[9](https://arxiv.org/html/2606.14217#bib.bib25),[23](https://arxiv.org/html/2606.14217#bib.bib26),[3](https://arxiv.org/html/2606.14217#bib.bib27),[30](https://arxiv.org/html/2606.14217#bib.bib28),[13](https://arxiv.org/html/2606.14217#bib.bib29)\] represent protein-ligand complexes as three-dimensional graphs or grids and explicitly model atomic interactions and spatial relationships. By introducing geometric inductive biases that are more closely aligned with physical binding mechanisms, these interaction-based approaches have demonstrated superior predictive performance compared with sequence-based and handcrafted descriptor methods, and have become a dominant paradigm for protein-ligand interaction modeling, particularly when implemented using interaction graph neural networks.
A more detailed discussion of related works and additional references is provided in Supplement [1](https://arxiv.org/html/2606.14217#S1a).
Despite their success, most existing graph-based approaches, including state-of-the-art interaction-based methods, still implicitly treat protein-ligand complexes as rigid entities and rely primarily on static geometric features extracted from a single bound conformation. Such assumptions overlook the inherently dynamic nature of molecular binding, which occurs over ensembles of conformations rather than fixed structures. Moreover, most methods do not explicitly account for the discrepancy in conformational dynamics between the apo (unbound) and holo (bound) states of protein-ligand complexes. From a biophysical perspective, protein-ligand binding is governed by the underlying potential energy surface, which determines the relative stability and conformational flexibility of the apo and holo states. Although explicitly computing the full potential energy surface is computationally infeasible at scale, its local structure encodes essential information about molecular stiffness and flexibility that can be captured through curvature.
Motivated by this observation, we incorporate descriptors inspired by such local conformational energy variations as curvature-informed priors within geometric representation learning and propose a curvature-informed potential energy surface (CPES) graph neural network for protein-ligand binding affinity prediction. By integrating these priors into a hierarchical protein-ligand graph framework, CPES models molecular flexibility and dynamic compatibility alongside interaction geometry, and thus captures both protein-ligand interactions and binding-induced conformational changes. In summary, the main contributions of our work are as follows.
- A physical energy curvature-informed inductive bias is introduced for protein-ligand binding affinity prediction. We formulate binding affinity prediction from a physics-informed perspective and argue that incorporating energy-related geometric information reflecting physical and biological binding mechanisms is critical for generalization and interpretability.
- A curvature-informed potential energy surface (CPES) graph neural network is proposed for geometric representation learning. CPES integrates curvature-derived descriptors, defined as the eigenvalues of the Hessian of the potential energy surface at equilibrium (i.e., the principal curvatures), enabling the model to capture molecular flexibility and dynamic compatibility beyond static interaction geometry.
- CPES models the discrepancy between apo (unbound) and holo (bound) conformational dynamics via cross-attention over curvature-informed representations, capturing binding-induced conformational changes and achieving improved predictive performance and more biologically meaningful modeling across diverse benchmarks.
## 2 Curvature of the Potential Energy Surface
The stability and flexibility of protein\-ligand systems in configuration space are fundamentally governed by the local curvature of the potential energy surface near equilibrium\. The potential energy function defines how the system’s energy varies with molecular conformation, and under physiological conditions, conformational fluctuations are typically confined to small deviations around an equilibrium or metastable state\. Consequently, the local shape of the potential energy surface in this region provides a natural physical description of conformational behavior\.
Consider a system composed of $N$ interaction sites, whose configuration is described by the Cartesian coordinate vector

$$
\mathbf{q} \in \mathbb{R}^{3N},
$$

where each triplet corresponds to the three-dimensional position of a site. Let $\mathbf{q}_0$ denote an equilibrium conformation. In the vicinity of $\mathbf{q}_0$, the potential energy function $V(\mathbf{q})$ can be expanded as

$$
\begin{split}
V(\mathbf{q}) =\; & V(\mathbf{q}_0) + \nabla V(\mathbf{q}_0)^{\top}(\mathbf{q}-\mathbf{q}_0) \\
& + \frac{1}{2}(\mathbf{q}-\mathbf{q}_0)^{\top}\mathbf{H}(\mathbf{q}-\mathbf{q}_0) + o\!\left(\|\mathbf{q}-\mathbf{q}_0\|^{2}\right),
\end{split}
$$

where $\nabla V(\mathbf{q}_0)$ is the first derivative of the potential energy, and $\mathbf{H} \in \mathbb{R}^{3N \times 3N}$ is the Hessian matrix with elements

$$
H_{ij} = \left.\frac{\partial^{2} V}{\partial q_i\,\partial q_j}\right|_{\mathbf{q}=\mathbf{q}_0}.
$$

At equilibrium, the net force acting on the system vanishes, implying $\nabla V(\mathbf{q}_0) = \mathbf{0}$. Under this condition, the local shape of the potential energy surface is fully determined by the second-order term, and the Hessian provides a complete description of local curvature.
Because the Hessian is a real symmetric matrix constructed from second derivatives of the potential energy, it admits an orthogonal spectral decomposition
$$
\mathbf{H} = \mathbf{\Phi}\,\mathbf{\Lambda}\,\mathbf{\Phi}^{\top}, \qquad \mathbf{\Lambda} = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_{3N}),
$$

where the columns of $\mathbf{\Phi}$ define orthonormal mode directions. In this eigenbasis, the second-order variation of the potential energy is fully decoupled, such that each eigenvalue $\lambda_k$ directly quantifies the local curvature of the energy surface along one of the orthogonal directions; the eigenvalues can thus be interpreted as principal curvatures in a generalized high-dimensional setting. At a local minimum of the potential energy, all non-zero eigenvalues are positive. Each eigenvalue corresponds to the second-order variation of the potential energy along its associated eigenvector and therefore quantifies the local curvature in that direction. Eigenvalues that are exactly zero correspond to motions that do not change the internal potential energy. For three-dimensional molecular systems, the Hessian therefore typically exhibits at least six zero eigenvalues, associated with rigid-body translations and rotations.
In summary, the spectral structure of the Hessian provides a compact and physically interpretable characterization of the local curvature of the potential energy surface\. Low\-curvature modes dominate conformational changes because displacements along these directions are energetically inexpensive, allowing large\-scale collective rearrangements, whereas high\-curvature modes strongly resist deformation\. This curvature\-based perspective motivates the use of low\-curvature spectral information to describe global flexibility and collective conformational behavior in subsequent modeling\.
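To make the decoupling explicit, a short worked step may help. It assumes the harmonic approximation around $\mathbf{q}_0$, writes $\boldsymbol{\varphi}_k$ for the $k$-th column of $\mathbf{\Phi}$, and introduces modal coordinates $c_k$ (notation that is not part of the original text). The final line additionally assumes classical equipartition at temperature $T$, with $k_B$ the Boltzmann constant.

$$
\begin{aligned}
\delta\mathbf{q} &= \mathbf{q}-\mathbf{q}_0 = \sum_{k} c_k\,\boldsymbol{\varphi}_k, \qquad c_k = \boldsymbol{\varphi}_k^{\top}\,\delta\mathbf{q}, \\
V(\mathbf{q}) - V(\mathbf{q}_0) &\approx \frac{1}{2}\,\delta\mathbf{q}^{\top}\mathbf{H}\,\delta\mathbf{q} = \frac{1}{2}\sum_{k} \lambda_k\, c_k^{2}, \\
\langle c_k^{2} \rangle &= \frac{k_B T}{\lambda_k} \quad (\lambda_k > 0).
\end{aligned}
$$

Under these assumptions, the mean-square amplitude along a mode scales inversely with its curvature, which is one way to see why the smallest non-zero eigenvalues account for most of the conformational fluctuation.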
SE(3) Invariance of the Hessian Eigenvalues. To ensure that curvature-based descriptors are independent of the global coordinate frame, we note that the eigenvalue spectrum of the Hessian is invariant under rigid-body transformations in SE(3). Because the potential energy depends only on internal geometric relationships, it is invariant under global rigid-body transformations. In particular, global translations do not alter the Hessian matrix, as second derivatives of the potential energy are unaffected by constant shifts of the coordinates. Global rotations correspond to orthogonal changes of basis in the $3N$-dimensional configuration space, under which the Hessian undergoes an orthogonal similarity transformation. Such transformations preserve the eigenvalue spectrum, implying that Hessian eigenvalues are invariant under SE(3) transformations, while the corresponding eigenvectors rotate with the coordinate frame. This property can be formally stated as the following proposition.
Proposition (SE(3) equivariance of the Hessian). A potential energy function and its Hessian matrix are given as follows

$$
\begin{aligned}
V(\mathbf{q}) \;&:\; \mathbb{R}^{3N} \to \mathbb{R}, \\
\mathbf{H}(\mathbf{q}) \;&=\; \nabla^{2} V(\mathbf{q}) = \frac{\partial^{2} V(\mathbf{q})}{\partial \mathbf{q}\,\partial \mathbf{q}^{\top}} \in \mathbb{R}^{3N \times 3N}.
\end{aligned}
\tag{1}
$$

The potential energy function depends only on the internal geometric relations of the system and is invariant under global rigid-body transformations. Specifically, for any rotation $\mathbf{R} \in \mathrm{SO}(3)$ and any translation $\mathbf{t} \in \mathbb{R}^{3}$, with $(\mathbf{R}, \mathbf{t}) \in \mathrm{SE}(3)$, define

$$
\mathbf{Q} = \operatorname{diag}(\underbrace{\mathbf{R}, \ldots, \mathbf{R}}_{N}) \in \mathbb{R}^{3N \times 3N}, \qquad
\mathbf{c} = (\underbrace{\mathbf{t}, \ldots, \mathbf{t}}_{N}) \in \mathbb{R}^{3N}.
$$

The SE(3)-transformed configuration is given by

$$
\mathbf{q}' = \mathbf{Q}\,\mathbf{q} + \mathbf{c}.
$$

Then, under an SE(3) transformation, the potential energy function $V'(\mathbf{q}')$ and its Hessian matrix $\mathbf{H}'(\mathbf{q}')$ satisfy

$$
V'(\mathbf{q}') \equiv V(\mathbf{q}), \qquad \mathbf{H}'(\mathbf{q}') = \mathbf{Q}\,\mathbf{H}(\mathbf{q})\,\mathbf{Q}^{\top}.
\tag{2}
$$

The equality $V'(\mathbf{q}') \equiv V(\mathbf{q})$ follows directly from the physical invariance of the potential energy under global rigid-body transformations. The corresponding relation for the Hessian matrix $\mathbf{H}'(\mathbf{q}') = \mathbf{Q}\,\mathbf{H}(\mathbf{q})\,\mathbf{Q}^{\top}$ is a mathematical consequence of this invariance and can be rigorously derived via the chain rule under coordinate reparameterization (detailed derivation is provided in Supplement [2](https://arxiv.org/html/2606.14217#S2a)).

Since $\mathbf{Q}$ is an orthogonal matrix satisfying $\mathbf{Q}^{\top}\mathbf{Q} = \mathbf{I}$, the relation between the Hessian matrices before and after the SE(3) transformation constitutes an orthogonal similarity transformation. Consequently, the eigenvalues of the Hessian are invariant, while the eigenvectors transform covariantly with the coordinate system, i.e.,

$$
\lambda_k' = \lambda_k, \qquad \boldsymbol{\varphi}_k' = \mathbf{Q}\,\boldsymbol{\varphi}_k, \quad k = 1, \ldots, 3N-6,
\tag{3}
$$

where $k$ indexes the normal modes of the Hessian matrix excluding the rigid-body zero modes.
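The proposition can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a unit-stiffness harmonic spring network as a stand-in for the potential $V$, and the helper names `spring_hessian` and `random_rotation` as well as the random six-site configuration are hypothetical.

```python
import numpy as np

def spring_hessian(coords, edges):
    """Hessian of a unit-stiffness spring network at its reference geometry (stand-in potential)."""
    n = coords.shape[0]
    H = np.zeros((3 * n, 3 * n))
    for i, j in edges:
        e = coords[j] - coords[i]
        e = e / np.linalg.norm(e)
        K = np.outer(e, e)                      # rank-one 3x3 block along the edge direction
        si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
        H[si, si] += K
        H[sj, sj] += K
        H[si, sj] -= K
        H[sj, si] -= K
    return H

def random_rotation(rng):
    """Random proper rotation: QR of a Gaussian matrix, with a column flipped if det = -1."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
n_sites = 6
coords = rng.normal(size=(n_sites, 3))          # hypothetical configuration
edges = [(i, j) for i in range(n_sites) for j in range(i + 1, n_sites)]

R = random_rotation(rng)
t = rng.normal(size=3)
coords_t = coords @ R.T + t                     # q' = Q q + c, applied site by site

H = spring_hessian(coords, edges)
H_t = spring_hessian(coords_t, edges)
Q = np.kron(np.eye(n_sites), R)                 # block-diagonal diag(R, ..., R)

# Largest deviation from H' = Q H Q^T and from equal eigenvalue spectra
hessian_gap = np.max(np.abs(H_t - Q @ H @ Q.T))
spectrum_gap = np.max(np.abs(np.linalg.eigvalsh(H_t) - np.linalg.eigvalsh(H)))
print(hessian_gap, spectrum_gap)
```

Both gaps should sit at floating-point round-off level, which is consistent with Eqs. (2) and (3) for this particular stand-in potential.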
## 3 Methods
### 3.1 Descriptors of Potential Energy Function Curvature
To obtain a computable approximation of the second\-order geometry of the potential energy surface, we adopt the anisotropic network model \(ANM\) to represent the potential energy function in the vicinity of a reference configuration\. The core assumptions of ANM are encoded in the functional form of the potential energy: near the reference configuration the potential energy depends solely on inter\-node distance fluctuations and is approximated by an anisotropic harmonic form
$$
V(\mathbf{q}) = \frac{1}{2}\sum_{(i,j)\in\mathcal{E}} \left( \lVert \mathbf{r}_{ij} \rVert - \lVert \mathbf{r}_{ij}^{0} \rVert \right)^{2}.
$$

Here, $\mathbf{r}_i \in \mathbb{R}^{3}$ denotes the three-dimensional coordinate vector of node $i$, $\mathbf{r}_{ij} = \mathbf{r}_j - \mathbf{r}_i$ denotes the relative displacement vector between nodes $i$ and $j$ in the current configuration, $\mathbf{r}_{ij}^{0} = \mathbf{r}_j^{0} - \mathbf{r}_i^{0}$ denotes the corresponding relative displacement vector in the reference configuration, and $\mathcal{E}$ is the set of undirected interacting node pairs. This formulation implicitly assumes small-amplitude fluctuations compatible with the harmonic approximation and, critically, that each interaction constrains motion only along its reference direction while transverse displacements do not contribute to the restoring force.

For each interacting pair $i$ and $j$, a unit direction vector is defined as

$$
\mathbf{e}_{ij} = \frac{\mathbf{r}_{ij}^{0}}{\lVert \mathbf{r}_{ij}^{0} \rVert},
$$

which specifies the geometric orientation of the interaction in three-dimensional space. Since the potential energy is sensitive only to distance changes along this direction, the second-order contribution of the interaction naturally takes the form of a projection onto $\mathbf{e}_{ij}$, leading to the local stiffness matrix, which characterizes the resistance of a local molecular region to small structural perturbations, i.e.,

$$
\mathbf{K}_{ij} = \mathbf{e}_{ij}\,\mathbf{e}_{ij}^{\top} \in \mathbb{R}^{3 \times 3}.
$$

This rank-one matrix penalizes only displacement components aligned with the interaction direction while leaving transverse motions unconstrained, thereby providing a direct mathematical realization of the anisotropic assumption. The global Hessian is assembled from these $3 \times 3$ stiffness blocks $\mathbf{K}_{ij}$ in an edge-wise manner. For each undirected interacting node pair $(i,j) \in \mathcal{E}$, the following block updates are applied

$$
\begin{aligned}
\mathbf{H}_{ii} \mathrel{+}= \mathbf{K}_{ij}, \quad \mathbf{H}_{jj} \mathrel{+}= \mathbf{K}_{ij}, \\
\mathbf{H}_{ij} \mathrel{-}= \mathbf{K}_{ij}, \quad \mathbf{H}_{ji} \mathrel{-}= \mathbf{K}_{ij},
\end{aligned}
\tag{4}
$$

where $\mathbf{H}_{ij} \in \mathbb{R}^{3 \times 3}$ denotes the $(i,j)$-th $3 \times 3$ sub-block of the Hessian matrix. This construction corresponds to expanding the quadratic energy contribution of each interaction and, by construction, ensures symmetry, positive semidefiniteness, and invariance of the potential energy under rigid-body translations. From this edge-wise assembly, the characteristic block structure of the Hessian becomes evident: off-diagonal blocks encode pairwise couplings, while diagonal blocks accumulate contributions from all interactions incident to each node. Accordingly, the assembled Hessian admits the following piecewise representation

$$
\mathbf{H}_{ij} =
\begin{cases}
-\mathbf{K}_{ij}, & i \neq j,\ (i,j) \in \mathcal{E}, \\
\displaystyle\sum_{k:(i,k)\in\mathcal{E}} \mathbf{K}_{ik}, & i = j, \\
\mathbf{0}, & \text{otherwise}.
\end{cases}
\tag{5}
$$
In the construction of Eqs. ([4](https://arxiv.org/html/2606.14217#S3.E4)) and ([5](https://arxiv.org/html/2606.14217#S3.E5)), the Hessian matrix $\mathbf{H}$ exhibits a typical graph-Laplacian-like block structure. The diagonal blocks are given by the summation of all neighboring edge contributions, whereas the off-diagonal blocks are given by the negative contribution of the corresponding edge. This form generalizes the classical scalar graph Laplacian $\mathbf{L} = \mathbf{D} - \mathbf{A}$ (where $\mathbf{D}$ denotes the degree matrix and $\mathbf{A}$ denotes the adjacency matrix) to a vector-valued graph Laplacian, where each edge $(i,j)$ is assigned a $3 \times 3$ anisotropic stiffness matrix $\mathbf{K}_{ij}$ rather than a single scalar weight. Therefore, $\mathbf{H}$ can be viewed as a second-order elliptic discrete operator defined on the space of node displacement vectors, and its quadratic form directly corresponds to the elastic potential energy of the system around the reference conformation. This perspective unifies the stiffness matrix in mechanics within the framework of spectral graph theory, providing a natural algebraic foundation for the subsequent eigendecomposition and low-frequency spectral feature extraction.
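The edge-wise assembly of Eq. (4) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the released implementation: the function name `anm_hessian` and the tetrahedral toy system are hypothetical, and unit stiffness is assumed for every edge, as in the potential above.

```python
import numpy as np

def anm_hessian(coords, edges):
    """Assemble the 3N x 3N Hessian from reference coordinates and undirected edges, following Eq. (4)."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    H = np.zeros((3 * n, 3 * n))
    for i, j in edges:
        r0 = coords[j] - coords[i]              # reference displacement r_ij^0
        e = r0 / np.linalg.norm(r0)             # unit direction e_ij
        K = np.outer(e, e)                      # rank-one stiffness block K_ij
        si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
        H[si, si] += K                          # diagonal blocks accumulate incident edges
        H[sj, sj] += K
        H[si, sj] -= K                          # off-diagonal blocks carry the pairwise coupling
        H[sj, si] -= K
    return H

# Hypothetical toy system: a regular tetrahedron with all six pairs connected.
coords = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float)
edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]

H = anm_hessian(coords, edges)
eigvals = np.linalg.eigvalsh(H)                 # ascending eigenvalues
n_zero = int(np.sum(np.abs(eigvals) < 1e-9))    # rigid-body modes
shift = np.tile([1.0, 0.0, 0.0], 4)             # uniform translation along x
print(n_zero, np.max(np.abs(H @ shift)))
```

For this rigid toy structure the spectrum contains exactly six zero eigenvalues (three translations and three rotations), and a uniform translation lies in the null space of $\mathbf{H}$, in line with the properties stated after Eq. (4).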
Following Hessian construction, symmetric eigendecomposition is performed
$$
\mathbf{H}\,\boldsymbol{\varphi}_k = \lambda_k \boldsymbol{\varphi}_k,
$$

where the eigenvalues $\lambda_k$ quantify the local curvature of the potential energy surface along orthogonal normal modes $\boldsymbol{\varphi}_k$. To remove rigid-body and near-zero modes, a relative cutoff is applied, retaining only modes satisfying

$$
\lambda_k > \varepsilon, \qquad \varepsilon = \varepsilon_{\mathrm{rel}} \max_{k} \lvert \lambda_k \rvert,
$$

where we set $\varepsilon_{\mathrm{rel}} = 10^{-6}$. Among the remaining modes, the smallest $K_{\mathrm{u}}$ eigenvalues are selected as global descriptors of potential energy function curvature. Under the harmonic approximation, the eigendecomposition of the Hessian reveals that low-frequency eigenvalues correspond to the flattest directions of the potential energy surface and thus dominate global flexibility and conformational accessibility. A logarithmic transform with standard numerical stabilization yields the graph-level curvature feature

$$
\mathbf{u} = \left[ \ln\lambda_1,\ \ln\lambda_2,\ \ldots,\ \ln\lambda_{K_{\mathrm{u}}} \right].
$$
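The filtering and log-transform steps can be illustrated as follows. This is a hedged sketch: the function name `curvature_descriptor` is hypothetical, the input is a synthetic matrix with a known spectrum rather than a molecular Hessian, and the "standard numerical stabilization" mentioned in the text is not specified, so a plain `np.log` on the strictly positive retained eigenvalues is assumed here.

```python
import numpy as np

def curvature_descriptor(H, k_u, eps_rel=1e-6):
    """Return ln of the k_u smallest eigenvalues that exceed the relative cutoff."""
    eigvals = np.linalg.eigvalsh(H)                 # real eigenvalues in ascending order
    eps = eps_rel * np.max(np.abs(eigvals))         # epsilon = eps_rel * max_k |lambda_k|
    kept = eigvals[eigvals > eps]                   # drops rigid-body and near-zero modes
    return np.log(kept[:k_u])                       # may be shorter than k_u if few modes remain

# Synthetic symmetric matrix: six zero modes plus six positive curvatures.
rng = np.random.default_rng(0)
true_spectrum = np.array([0.0] * 6 + [0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
Phi, _ = np.linalg.qr(rng.normal(size=(12, 12)))    # random orthonormal mode directions
H = Phi @ np.diag(true_spectrum) @ Phi.T
H = 0.5 * (H + H.T)                                 # symmetrize against round-off

u = curvature_descriptor(H, k_u=3)
print(u)                                            # approximately ln(0.5), ln(1), ln(2)
```

Because `eigvalsh` returns eigenvalues in ascending order, slicing after the cutoff directly selects the lowest-curvature modes.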
The above procedure is applied consistently across different conformational states. The only difference lies in how the node set and the undirected interaction edge set $\mathcal{E}$ are constructed. We build descriptors for the unbound ligand, the unbound protein, and the bound complex, yielding $\mathbf{u}_{\text{lig}}$, $\mathbf{u}_{\text{pro}}$, and $\mathbf{u}_{\text{cpx}}$, respectively.
#### 3.1.1 Unbound Ligand
The node set $\mathcal{V}_{\text{lig}}$ consists of ligand atoms, and the interaction set $\mathcal{E}_{\text{lig}}$ contains intra-ligand covalent bonds. The ligand Hessian and its spectral feature are

$$
\mathbf{H}_{\text{lig}} = \mathbf{H}(\mathcal{V}_{\text{lig}}, \mathcal{E}_{\text{lig}}), \quad
\mathbf{u}_{\text{lig}} = \left[ \ln\lambda_1^{\text{lig}}, \ldots, \ln\lambda_{K_{\mathrm{ul}}}^{\text{lig}} \right].
\tag{6}
$$

This feature characterizes the intrinsic curvature spectrum and low-frequency flexibility of the ligand in the unbound state.
#### 3.1.2 Unbound Protein
The node set $\mathcal{V}_{\text{pro}}$ consists of protein residues represented by $C_{\alpha}$ atoms. The interaction set $\mathcal{E}_{\text{pro}}$ is defined on the residue graph by projecting atomic covalent connectivity onto residue-residue interactions. Accordingly,

$$
\mathbf{H}_{\text{pro}} = \mathbf{H}(\mathcal{V}_{\text{pro}}, \mathcal{E}_{\text{pro}}), \quad
\mathbf{u}_{\text{pro}} = \left[ \ln\lambda_1^{\text{pro}}, \ldots, \ln\lambda_{K_{\mathrm{up}}}^{\text{pro}} \right].
\tag{7}
$$

This feature captures the intrinsic curvature spectrum of the protein pocket in the unbound state.
#### 3.1.3 Bound Complex
For the bound complex, ligand and protein nodes are unified as $\mathcal{V}_{\text{cpx}} = \mathcal{V}_{\text{lig}} \cup \mathcal{V}_{\text{pro}}$, and the interaction set includes intra-ligand interactions $\mathcal{E}_{\text{lig}}$, intra-protein interactions $\mathcal{E}_{\text{pro}}$, and inter-molecular interactions $\mathcal{E}_{\text{inter}}$. The inter-molecular set is defined by a distance cutoff hyperparameter $d_{\text{cutoff}}$, i.e.,

$$
\mathcal{E}_{\text{inter}} = \left\{ (i,j) \;\middle|\; i \in \mathcal{V}_{\text{lig}},\; j \in \mathcal{V}_{\text{pro}},\; \lVert \mathbf{r}_i - \mathbf{r}_j \rVert < d_{\text{cutoff}} \right\}.
\tag{8}
$$

Thus $\mathcal{E}_{\text{cpx}} = \mathcal{E}_{\text{lig}} \cup \mathcal{E}_{\text{pro}} \cup \mathcal{E}_{\text{inter}}$, and the bound complex Hessian is constructed as

$$
\mathbf{H}_{\text{cpx}} = \mathbf{H}(\mathcal{V}_{\text{cpx}}, \mathcal{E}_{\text{cpx}}), \quad
\mathbf{u}_{\text{cpx}} = \left[ \ln\lambda_1^{\text{cpx}}, \ldots, \ln\lambda_{K_{\mathrm{uc}}}^{\text{cpx}} \right].
\tag{9}
$$

Since $\mathbf{H}_{\text{cpx}}$ explicitly introduces ligand-protein couplings, the graph-level feature $\mathbf{u}_{\text{cpx}}$ represents the curvature spectrum of the bound-state ligand-protein complex rather than a trivial concatenation of the unbound ligand and protein spectra.
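The inter-molecular edge set of Eq. (8) amounts to a thresholded pairwise distance matrix. Below is a minimal sketch, assuming that complex nodes are indexed with ligand atoms first and protein nodes second; the function name `intermolecular_edges`, the toy coordinates, and the cutoff value of 4.0 are all hypothetical and not taken from the paper.

```python
import numpy as np

def intermolecular_edges(lig_xyz, pro_xyz, d_cutoff):
    """Return ligand-protein pairs closer than d_cutoff, indexed in the joint complex node set."""
    lig_xyz = np.asarray(lig_xyz, dtype=float)
    pro_xyz = np.asarray(pro_xyz, dtype=float)
    # Pairwise distances, shape (n_lig, n_pro), via broadcasting
    dists = np.linalg.norm(lig_xyz[:, None, :] - pro_xyz[None, :, :], axis=-1)
    lig_idx, pro_idx = np.nonzero(dists < d_cutoff)
    n_lig = lig_xyz.shape[0]
    # Assumed convention: protein node j is shifted by n_lig in the complex graph
    return [(int(i), int(n_lig + j)) for i, j in zip(lig_idx, pro_idx)]

lig_xyz = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]        # two ligand atoms (toy values)
pro_xyz = [[0.0, 0.0, 3.0], [10.0, 0.0, 0.0]]       # two protein nodes (toy values)
edges_inter = intermolecular_edges(lig_xyz, pro_xyz, d_cutoff=4.0)
print(edges_inter)                                   # [(0, 2), (1, 2)]
```

The resulting list can be concatenated with the intra-ligand and (index-shifted) intra-protein edges to form $\mathcal{E}_{\text{cpx}}$ before assembling $\mathbf{H}_{\text{cpx}}$.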
To ensure a consistent dimensionality of the spectral descriptors, we apply a padding strategy when the number of non-zero eigenvalues is smaller than the predefined spectral size (i.e., $K_{\mathrm{ul}}$, $K_{\mathrm{up}}$, or $K_{\mathrm{uc}}$). Specifically, if the available non-zero eigenvalues are insufficient, the remaining entries are filled by repeating the last (i.e., largest-index) non-zero eigenvalue.
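The padding rule can be written compactly with NumPy's edge padding. This is a small sketch under the assumption that the eigenvalues are already filtered and sorted in ascending order; the helper name `pad_spectrum` and the example values are hypothetical.

```python
import numpy as np

def pad_spectrum(eigvals, k):
    """Truncate to k entries, or pad to length k by repeating the last retained eigenvalue."""
    eigvals = np.asarray(eigvals, dtype=float)[:k]
    if eigvals.size == 0:
        raise ValueError("no non-zero eigenvalues available to pad from")
    # mode="edge" repeats the final value, i.e. the largest-index retained eigenvalue
    return np.pad(eigvals, (0, k - eigvals.size), mode="edge")

padded = pad_spectrum([0.2, 0.5, 1.1], k=5)
print(padded)            # [0.2 0.5 1.1 1.1 1.1]
```

Since the logarithm is applied element-wise, padding before or after the log transform produces the same descriptor.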
### 3.2 Cross-Attention on Binding-Induced Conformational Changes
To characterize binding\-induced conformational changes, we introduce a spectral cross\-attention mechanism that explicitly aligns and contrasts conformational responses among the unbound ligand, the unbound protein, and the bound ligand\-protein complex\. Unlike representations derived purely from static structures, this mechanism focuses on binding\-induced changes in conformational response patterns, i\.e\., how the system redistributes its sensitivity to dynamic perturbations across conformational directions\. Physically, such differences can be interpreted as changes in the curvature of the potential energy surface, and the resulting attention weights provide a data\-driven indication of which conformational directions are most strongly reshaped upon binding\.
Let $d$ denote the hidden dimension. We construct spectral representations for the ligand, protein, and complex as described above

$$
\mathbf{u}_{\text{lig}} \in \mathbb{R}^{K_{\mathrm{ul}}}, \quad
\mathbf{u}_{\text{pro}} \in \mathbb{R}^{K_{\mathrm{up}}}, \quad
\mathbf{u}_{\text{cpx}} \in \mathbb{R}^{K_{\mathrm{uc}}},
$$

where $K_{\mathrm{ul}}$, $K_{\mathrm{up}}$, and $K_{\mathrm{uc}}$ denote the numbers of retained spectral modes. Intuitively, each spectral component characterizes the stiffness or response magnitude along a particular conformational direction, facilitating quantitative comparison of binding-induced differences in conformational responses. Each spectral scalar is treated as a token and embedded into a $d$-dimensional feature space using a multi-layer perceptron (MLP), yielding

$$
\mathbf{U}_{\text{lig}} \in \mathbb{R}^{K_{\mathrm{ul}} \times d}, \quad
\mathbf{U}_{\text{pro}} \in \mathbb{R}^{K_{\mathrm{up}} \times d}, \quad
\mathbf{U}_{\text{cpx}} \in \mathbb{R}^{K_{\mathrm{uc}} \times d}.
$$
To contrast conformational response patterns before binding (ligand or protein) and after binding (complex), we apply spectral cross-attention, where the ligand or protein spectra serve as queries and the complex spectra serve as keys and values. For $s\in\{\text{lig},\text{pro}\}$, the query, key, and value matrices are given by

$$\begin{aligned}
\mathbf{Q}_{s}&=\mathbf{U}_{s}\mathbf{W}_{Q}^{(s)}\in\mathbb{R}^{K_{\mathrm{us}}\times d},\\
\mathbf{K}_{\text{cpx}}^{(s)}&=\mathbf{U}_{\text{cpx}}\mathbf{W}_{K}^{(s)}\in\mathbb{R}^{K_{\mathrm{uc}}\times d},\\
\mathbf{V}_{\text{cpx}}^{(s)}&=\mathbf{U}_{\text{cpx}}\mathbf{W}_{V}^{(s)}\in\mathbb{R}^{K_{\mathrm{uc}}\times d},
\end{aligned}$$

where $K_{\mathrm{us}}=K_{\mathrm{ul}}$ for $s=\text{lig}$ and $K_{\mathrm{us}}=K_{\mathrm{up}}$ for $s=\text{pro}$, and $\mathbf{W}_{Q}^{(s)},\mathbf{W}_{K}^{(s)},\mathbf{W}_{V}^{(s)}\in\mathbb{R}^{d\times d}$ are learnable weight matrices. The spectral cross-attention is then computed as

$$\widetilde{\mathbf{U}}_{s\to\text{c}}=\operatorname{softmax}\!\left(\frac{\mathbf{Q}_{s}\,\mathbf{K}_{\text{cpx}}^{(s)\top}}{\sqrt{d}}\right)\mathbf{V}_{\text{cpx}}^{(s)}\in\mathbb{R}^{K_{\mathrm{us}}\times d}.$$

This formulation uses the bound complex as a reference to match and reweight ligand or protein contributions across spectral modes, thereby emphasizing conformational directions that are most altered upon binding. In this way, the attention distribution implicitly captures structured differences in conformational responses and, consequently, in the curvature of the potential energy surface between the unbound components and the bound complex.
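The cross-attention step can be illustrated with a small dependency-free sketch. It is not the authors' implementation: the helper names (`matmul`, `softmax_rows`, `spectral_cross_attention`, `random_matrix`), the toy sizes, and the random stand-ins for the embedded tokens and weight matrices are all assumptions made for illustration. It only shows the shapes involved and that each query mode distributes a unit of attention over the complex modes.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Row-wise softmax with max-subtraction for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def spectral_cross_attention(u_s, u_cpx, w_q, w_k, w_v):
    """Attend from query tokens u_s (K_us x d) to complex tokens u_cpx (K_uc x d)."""
    d = len(w_q)
    q = matmul(u_s, w_q)                    # K_us x d
    k = matmul(u_cpx, w_k)                  # K_uc x d
    v = matmul(u_cpx, w_v)                  # K_uc x d
    k_t = [list(col) for col in zip(*k)]    # d x K_uc
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, k_t)]
    attn = softmax_rows(scores)             # K_us x K_uc, each row sums to 1
    return attn, matmul(attn, v)            # attended tokens: K_us x d


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned embeddings or weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d, k_ul, k_uc = 4, 3, 5                     # toy sizes, not the paper's settings
u_lig = random_matrix(k_ul, d, rng)         # embedded ligand spectral tokens
u_cpx = random_matrix(k_uc, d, rng)         # embedded complex spectral tokens
w_q, w_k, w_v = (random_matrix(d, d, rng) for _ in range(3))
attn, u_tilde = spectral_cross_attention(u_lig, u_cpx, w_q, w_k, w_v)
print(len(attn), len(attn[0]), len(u_tilde), len(u_tilde[0]))
```

Because the softmax is taken over the complex axis, the output keeps one row per query mode, which is why the attended representation has shape $K_{\mathrm{us}}\times d$ rather than $K_{\mathrm{uc}}\times d$.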
Finally, we aggregate the sequence-level responses into a graph-level vector and combine them with a query-side residual, resulting in a potential energy surface (PES) curvature-aware representation

$$\mathbf{g}_{\text{CurvPES}}^{(s)}=\operatorname{MLP}\!\left(\sum_{i=1}^{K_{\mathrm{us}}}\widetilde{\mathbf{U}}_{s\to\text{c},i}\right)+\sum_{i=1}^{K_{\mathrm{us}}}\mathbf{U}_{s,i}\in\mathbb{R}^{d}.\tag{10}$$

This representation summarizes binding-induced changes in conformational response patterns and captures interpretable features related to variations in energy curvature.
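As a rough illustration of the readout in Eq. (10), the sketch below sum-pools the attended tokens, passes the pooled vector through an MLP, and adds the sum-pooled query tokens as a residual. The two-layer ReLU network, the identity weights, and the names `sum_pool`, `relu_mlp`, and `curv_pes_readout` are assumptions chosen so the arithmetic can be checked by hand; the MLP architecture is not specified in this section.

```python
def sum_pool(tokens):
    """Sum token vectors (rows) into a single d-dimensional vector."""
    return [sum(col) for col in zip(*tokens)]


def relu_mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with ReLU; a placeholder for the paper's MLP."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, col)) + b)
              for col, b in zip(zip(*w1), b1)]
    return [sum(hi * w for hi, w in zip(hidden, col)) + b
            for col, b in zip(zip(*w2), b2)]


def curv_pes_readout(u_tilde, u_s, mlp):
    """g = MLP(sum_i U~_i) + sum_i U_i, following Eq. (10)."""
    attended = mlp(sum_pool(u_tilde))
    residual = sum_pool(u_s)
    return [a + r for a, r in zip(attended, residual)]


# Toy example with d = 2 and K_us = 2; identity weights keep the result checkable.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZEROS = [0.0, 0.0]


def toy_mlp(x):
    return relu_mlp(x, IDENTITY, ZEROS, IDENTITY, ZEROS)


u_tilde = [[1.0, 2.0], [3.0, 4.0]]    # attended tokens, pooled sum = [4, 6]
u_s = [[0.5, 0.5], [0.5, -1.5]]       # query tokens, pooled sum = [1, -1]
g = curv_pes_readout(u_tilde, u_s, toy_mlp)
print(g)  # [5.0, 5.0]
```

Sum pooling makes the readout independent of the ordering of modes within each token set, and the residual keeps the unbound spectrum visible to the regressor even when the attended term is small.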
Table 1: Performance comparison of CPES and baselines on the 2013 core set, 2016 core set, and the 2019 holdout set.
### 3\.3The Framework of CPES
An overview of the CPES framework is illustrated in Fig. [1](https://arxiv.org/html/2606.14217#S1.F1). CPES follows a dual-branch architecture that integrates geometry-aware static interaction modeling with curvature-informed potential energy surface (PES) dynamics. Given a complex graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$, the node set is defined as $\mathcal{V}=\mathcal{V}_{\text{lig}}\cup\mathcal{V}_{\text{pro}}$ and the edge set as $\mathcal{E}=\mathcal{E}_{\mathrm{intra}}\cup\mathcal{E}_{\mathrm{inter}}$, where $\mathcal{E}_{\mathrm{intra}}=\mathcal{E}_{\text{lig}}\cup\mathcal{E}_{\text{pro}}$; here $\mathcal{E}_{\mathrm{intra}}$ and $\mathcal{E}_{\mathrm{inter}}$ represent intra-molecular and inter-molecular interactions, respectively. Each atom is associated with an initial feature vector and a three-dimensional coordinate. These atom-level features are first mapped into a hidden representation space through a learnable embedding layer.
The static branch employs graph neural networks (GNNs) over both intra-molecular and inter-molecular edges of the complex graph to capture structural interaction information. Based on the atom-level representations, soft clustering is applied separately to the ligand and protein subgraphs to obtain hierarchical cluster-level features, denoted as $\mathbf{Z}_{\text{lig}}$ and $\mathbf{Z}_{\text{pro}}$. A bidirectional cross-attention module is then introduced between ligand and protein clusters to capture key structural interaction regions, yielding two graph-level static representations $\mathbf{g}_{\text{static}}^{(\text{lig})}$ and $\mathbf{g}_{\text{static}}^{(\text{pro})}$ (see Supplement [3](https://arxiv.org/html/2606.14217#S3a) for detailed formulations and architectural descriptions).
In parallel, the curvature-informed PES branch uses the Hessian eigenspectrum to characterize conformational flexibility and collective dynamics beyond static interaction geometry. As defined in Eq. ([10](https://arxiv.org/html/2606.14217#S3.E10)), this branch compares the unbound ligand and protein with the bound complex in the spectral space, producing $\mathbf{g}_{\text{CurvPES}}^{(\text{lig})}$ and $\mathbf{g}_{\text{CurvPES}}^{(\text{pro})}$ to describe curvature-based conformational response differences upon binding.
Finally, CPES fuses the static and curvature\-informed dynamic representations at the graph level, i\.e\.,
$$\mathbf{g}=\mathbf{g}_{\text{static}}^{(\text{lig})}+\mathbf{g}_{\text{static}}^{(\text{pro})}+\mathbf{g}_{\text{CurvPES}}^{(\text{lig})}+\mathbf{g}_{\text{CurvPES}}^{(\text{pro})}.\tag{11}$$

A final MLP-based regressor maps the fused representation to the predicted binding affinity $\hat{y}=f(\mathbf{g})$. The model is trained by minimizing the mean squared error loss $\mathcal{L}_{\text{mse}}=\frac{1}{N}\sum_{n=1}^{N}(\hat{y}_{n}-y_{n})^{2}$, with $y_{n}$ and $\hat{y}_{n}$ denoting the experimental and predicted binding affinities of the $n$-th sample, respectively.
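The additive fusion of Eq. (11) and the mean squared error objective can be written in a few lines. The following is a minimal sketch with made-up two-dimensional vectors and affinity values; the function names `fuse` and `mse_loss` are hypothetical, and the regressor $f$ is omitted.

```python
def fuse(*representations):
    """Element-wise sum of graph-level vectors, as in Eq. (11)."""
    return [sum(values) for values in zip(*representations)]


def mse_loss(predicted, target):
    """Mean squared error over N samples."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Four illustrative graph-level vectors: static (lig, pro) and CurvPES (lig, pro).
g = fuse([1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [0.5, 0.5])
loss = mse_loss([6.0, 7.5], [6.5, 7.0])
print(g, loss)  # [2.0, 3.0] 0.25
```

Summation requires all four representations to share the hidden dimension $d$, and it adds no fusion parameters, in contrast to concatenation followed by a projection.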
### 3\.4Implementation Details
We implement CPES using the PyTorch Geometric (PyG) framework, with PyTorch serving as the backend. Experiments are conducted on an NVIDIA GeForce RTX 4080 GPU with 16 GB of memory. The model parameters are optimized using the Adam optimizer with an initial learning rate of $1\times 10^{-4}$ and a weight decay of $1\times 10^{-6}$ to mitigate overfitting. To further stabilize the training process, we employ a learning rate scheduler, which monitors the validation performance and reduces the learning rate by a factor of $0.5$ if no improvement is observed for $50$ consecutive epochs. The source code is available at https://github.com/Peng-Fei-Sun/CPES.
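The plateau-based schedule described above can be sketched without any framework. The class below is an assumption-laden illustration rather than the authors' code: it takes the monitored quantity to be a validation loss (lower is better), uses a hypothetical class name `PlateauSchedule`, and halves the rate exactly when the 50th consecutive non-improving epoch is seen. Library schedulers such as PyTorch's `ReduceLROnPlateau` follow the same idea but may count epochs slightly differently.

```python
class PlateauSchedule:
    """Reduce the learning rate after `patience` epochs without improvement."""

    def __init__(self, lr=1e-4, factor=0.5, patience=50):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss and return the current rate."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr


schedule = PlateauSchedule()
schedule.step(1.0)                  # first epoch sets the best loss
for _ in range(49):
    schedule.step(1.0)              # 49 stagnant epochs: rate unchanged
lr_before = schedule.lr
lr_after = schedule.step(1.0)       # 50th stagnant epoch triggers the decay
print(lr_before, lr_after)
```

Weight decay is handled by the optimizer and is therefore not part of this sketch.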
## 4Experiments
In this section, we first assess cross-dataset generalization performance on widely used PDBbind benchmarks and the external CSAR NRC-HiQ dataset. We then conduct ablation studies to examine the contributions of the curvature-informed PES descriptors and attention modules, followed by interpretability analyses that reveal how CPES captures both key binding sites and binding-related dynamic modes. Due to limited space, additional experimental settings and extended results are provided in Supplement [4](https://arxiv.org/html/2606.14217#S4a).

Figure 2: Performance comparison of CPES on the CSAR NRC-HiQ dataset in terms of (a) RMSE and (b) Pearson correlation coefficient.

Table 2: Ablation study results of CPES with varying settings on the 2013 core set, 2016 core set, and the 2019 holdout set. PES: curvature-informed descriptors derived from the potential energy surface (PES); Attn-S: attention for the static (S) structural interactions; Attn-D: attention for the dynamic (D) conformational modes.
### 4\.1Cross\-Dataset Generalization
Datasets. For a fair comparison, we strictly follow the experimental protocols used in prior works. We adopt the PDBbind dataset and use the same data splits as in GIGN [[30](https://arxiv.org/html/2606.14217#bib.bib28)] and CheapNet [[13](https://arxiv.org/html/2606.14217#bib.bib29)]. All baseline models and CPES are trained and evaluated under this unified data splitting scheme, so that performance differences reflect the methods rather than the data. This setup enables a reliable assessment of cross-dataset generalization performance.
We train the model on 11,903 samples from the PDBbind 2016 general set, with 1,000 samples used for validation\. The evaluation is conducted on three widely used benchmark test sets: \(i\) the CASF\-2013 benchmark \(107 samples\), \(ii\) the CASF\-2016 benchmark \(285 samples\), both derived from the PDBbind core set, and \(iii\) a PDBbind 2019 holdout test set consisting of 4,366 complexes that are non\-overlapping with the aforementioned splits\.
Comparison with Baselines. We compare CPES with a diverse set of representative baselines, covering interaction-free methods, interaction-based methods, and interaction-based methods with attention mechanisms. The interaction-free category includes DeepDTA [[16](https://arxiv.org/html/2606.14217#bib.bib16)], GraphDTA (with GCN, GAT, GIN, and GAT-GCN) [[15](https://arxiv.org/html/2606.14217#bib.bib17)], and MGraphDTA [[31](https://arxiv.org/html/2606.14217#bib.bib18)]. The interaction-based category comprises RF-Score [[2](https://arxiv.org/html/2606.14217#bib.bib30)], Pafnucy [[22](https://arxiv.org/html/2606.14217#bib.bib10)], OnionNet [[34](https://arxiv.org/html/2606.14217#bib.bib31)], PotentialNet [[5](https://arxiv.org/html/2606.14217#bib.bib19)], SchNet [[19](https://arxiv.org/html/2606.14217#bib.bib20)], GNN-DTI [[14](https://arxiv.org/html/2606.14217#bib.bib21)], IGN [[7](https://arxiv.org/html/2606.14217#bib.bib22)], EGNN [[17](https://arxiv.org/html/2606.14217#bib.bib23)], GIGN [[30](https://arxiv.org/html/2606.14217#bib.bib28)], MetalProGNet [[8](https://arxiv.org/html/2606.14217#bib.bib32)], SS-GNN [[33](https://arxiv.org/html/2606.14217#bib.bib33)], and EHIGN [[29](https://arxiv.org/html/2606.14217#bib.bib34)]. The interaction-based methods with attention mechanisms include AttentionSiteDTI [[32](https://arxiv.org/html/2606.14217#bib.bib24)], CAPLA [[9](https://arxiv.org/html/2606.14217#bib.bib25)], GAABind [[23](https://arxiv.org/html/2606.14217#bib.bib26)], DEAttentionDTA [[3](https://arxiv.org/html/2606.14217#bib.bib27)], and CheapNet [[13](https://arxiv.org/html/2606.14217#bib.bib29)]. These methods further introduce attention-based designs to enhance the modeling of important interaction patterns or informative regions.
Our proposed CPES belongs to this family but further integrates curvature-informed dynamic descriptors, enabling more effective modeling of both static interaction patterns and binding-related conformational compatibility. The results are summarized in Table [1](https://arxiv.org/html/2606.14217#S3.T1). As shown, CPES consistently achieves the best overall performance across the three benchmark test sets.
### 4\.2External Evaluation on Non\-PDBbind Test Set
We further evaluate CPES on the CSAR NRC-HiQ dataset [[11](https://arxiv.org/html/2606.14217#bib.bib35)], which serves as an external benchmark for protein-ligand binding affinity prediction outside the PDBbind dataset. Following the evaluation protocol of CheapNet [[13](https://arxiv.org/html/2606.14217#bib.bib29)], we remove complexes that cannot be processed by RDKit and exclude those overlapping with the training data, resulting in 14 samples for evaluation. Fig. [2](https://arxiv.org/html/2606.14217#S4.F2) presents the comparison between CPES and other interaction-based methods.
CPES achieves the best performance among all compared models, with an RMSE of 1\.102 and a Pearson correlation coefficient of 0\.938, consistently outperforming existing approaches on both metrics\. These results highlight the strong generalization capability of CPES, which can be attributed to its curvature\-informed inductive bias on conformational dynamics, enabling effective modeling of complex protein\-ligand interactions in external datasets\.
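For reference, the two reported metrics can be computed as follows. This is a generic sketch using the standard definitions of RMSE and the Pearson correlation coefficient; the function names and the toy affinity values are illustrative and are not taken from the paper's data.

```python
import math


def rmse(predicted, target):
    """Root mean squared error between predictions and targets."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target))


def pearson(predicted, target):
    """Pearson correlation coefficient between predictions and targets."""
    n = len(target)
    mean_p = sum(predicted) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / math.sqrt(var_p * var_t)


y_true = [4.0, 6.0, 8.0]
y_pred = [5.0, 7.0, 9.0]            # every prediction is off by a constant +1
print(rmse(y_pred, y_true), pearson(y_pred, y_true))  # 1.0 1.0
```

The toy case shows why both metrics are reported together: a constant offset leaves the correlation at 1 while the RMSE is nonzero, so the correlation captures ranking quality and the RMSE captures absolute error.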
Figure 3: Visualization of cross-attention across static structures and dynamic modes for the complex with PDB ID 3myg. (a,b) Bidirectional cross-attention maps between ligand and protein clusters in the static structural representation. (c,d) Cross-attention maps from ligand modes and protein modes to complex modes in the dynamic representation. (e,f) Mean attention distributions across complex modes for ligand-to-complex (L2C) and protein-to-complex (P2C) attention, together with their cumulative attention curves. In both cases, the attention is mainly concentrated on the low-index complex modes, suggesting that the model preferentially focuses on a small number of dominant collective motions.
### 4\.3Ablation Study
To evaluate the contribution of each component in CPES, we conduct an ablation study by systematically removing the curvature-informed descriptors derived from the potential energy surface (PES), the static attention module (Attn-S) that captures structure-based interactions at the protein–ligand interface, and the dynamic attention module (Attn-D) that models conformational flexibility through spectral representations of collective modes. The results on the 2013 core set, 2016 core set, and the 2019 holdout set are summarized in Table [2](https://arxiv.org/html/2606.14217#S4.T2).
Overall, CPES consistently achieves the best performance across all datasets, demonstrating the effectiveness of jointly modeling static interactions and dynamic information under the conformational curvature\-informed inductive bias\. First, removing all three components \(Variant 1\) leads to the worst performance, indicating that the baseline model without these mechanisms is insufficient to capture the complexity of protein\-ligand binding\. Introducing the static attention module alone \(Variant 2\) yields noticeable improvements over Variant 1, suggesting that modeling static interactions at the binding interface is beneficial\. Similarly, incorporating only the PES descriptors \(Variant 3\) also improves performance, highlighting the importance of curvature\-informed geometric features\. When both PES and Attn\-S are included \(Variant 4\), the performance is further enhanced, indicating that curvature\-aware representations and static interaction modeling are complementary\. On the other hand, introducing the dynamic attention module \(Variant 5\) leads to additional gains, demonstrating the value of capturing conformational flexibility through spectral representations\. Notably, the incorporation of PES consistently improves performance across all variants, indicating that curvature\-informed descriptors provide meaningful inductive bias beyond purely geometric representations\. Moreover, the dynamic attention module further enhances performance, particularly on the 2019 holdout set, suggesting its effectiveness in capturing binding\-induced flexibility and improving generalization to unseen complexes\.
Figure 4: Scatter plots of the attention mode indices for the L2C and P2C dynamic-mode cross-attention on the (a) 2013 core set, (b) 2016 core set, and (c) 2019 holdout set. Blue points denote the peak mode indices of the mean attention distributions, while gray points denote the mode indices at which the cumulative attention reaches 80%. Marginal histograms on the top and right show the corresponding distributions along the L2C and P2C axes, respectively.
### 4\.4Interpretability of CPES
To investigate whether the improved performance of CPES is accompanied by meaningful physical interpretability, we analyze the cross-attention patterns in both the static structural space and the dynamic modal space. CPES explicitly combines structural interaction learning with conformational curvature-informed dynamical descriptors, and the attention maps offer a direct way to examine how these two information streams contribute to affinity prediction.
Fig. [3](https://arxiv.org/html/2606.14217#S4.F3)(a) and (b) illustrate the bidirectional cross-attention between ligand and protein clusters in the static structural representation. The horizontal and vertical axes correspond to cluster indices, since atoms are grouped into clusters via soft clustering in the preceding representation learning stage. Fig. [3](https://arxiv.org/html/2606.14217#S4.F3)(a) shows the ligand-to-protein attention, where each row of the heatmap is normalized to sum to one, indicating how each ligand cluster distributes its attention over protein clusters. Fig. [3](https://arxiv.org/html/2606.14217#S4.F3)(b), on the other hand, presents the protein-to-ligand attention, where each column is normalized, reflecting how each protein cluster allocates attention to ligand clusters. The cross-attention maps of the static structural representation are sparse and nonuniform, with only a limited number of cluster pairs receiving strong responses. This indicates that the static structural component of CPES identifies a subset of structurally important interaction regions at the binding interface. Such selective emphasis is desirable, since protein-ligand recognition is typically dominated by a limited number of critical local interactions.
By comparison, Fig. [3](https://arxiv.org/html/2606.14217#S4.F3)(c) and (d) exhibit a notably different pattern in the dynamic modal space. For clearer visualization and compact figure presentation, the padded ligand and protein modes are removed. Instead of the discrete, particle-like distributions observed in Fig. [3](https://arxiv.org/html/2606.14217#S4.F3)(a, b), the attention maps in Fig. [3](https://arxiv.org/html/2606.14217#S4.F3)(c, d) appear smoother and more gradually varying along the complex mode axis, with attention predominantly concentrated in the low-index modal region and then progressively decaying toward higher modes. In other words, while the static attention mainly identifies localized structural binding sites, the dynamic attention reflects a broader distribution over the modal spectrum and emphasizes a small set of dominant low-frequency collective motions. This difference indicates that the two branches of CPES provide complementary interpretability: the static branch identifies key structural interaction regions, whereas the dynamic branch characterizes binding-relevant dynamical coordination at the level of collective modes.
Fig. [3](https://arxiv.org/html/2606.14217#S4.F3)(e) and (f) aggregate the attention contributions from ligand modes or protein modes to each complex mode, resulting in mean attention distributions over the complex mode spectrum. This mean attention can be interpreted as the overall level of attention that the unbound ligand or protein collectively assigns to each mode of the bound complex. This compression discards the detailed pairwise information in the original attention maps, but offers a clearer view of the overall distribution trend. In addition, the cumulative attention curves illustrate how the attention mass accumulates across the complex modes. For both ligand-to-complex and protein-to-complex attention, the mean attention distributions exhibit peaks in the low-index region of the complex mode spectrum, while the cumulative attention curves increase rapidly at early modes and then flatten progressively. These trends show that most of the dynamic attention mass is concentrated on a small number of low-index complex modes. Since low-index modes usually correspond to slow, large-scale, and cooperative motions, this suggests that CPES preferentially exploits the dominant collective motions that are most likely to be relevant to conformational adaptation and molecular recognition.
To further summarize this behavior across test samples, we plot the peak attention mode index and the 80% cumulative attention index of the ligand-to-complex and protein-to-complex attentions in Fig. [4](https://arxiv.org/html/2606.14217#S4.F4). The blue points represent the peak modes receiving the maximum mean attention, and the gray points represent the mode indices at which the cumulative attention reaches 80%. It can be observed that the peak indices are consistently concentrated near the lowest-index region, indicating that the strongest dynamic responses are dominated by early complex modes. In addition, the scatter distributions exhibit a bias toward smaller L2C indices, suggesting a stronger preference of the ligand-to-complex attention for low-index complex modes. Meanwhile, the 80% cumulative indices remain distributed within a relatively limited spectral range, suggesting that most of the attention mass can already be captured by only a subset of low-frequency complex modes rather than being broadly distributed across the entire spectrum. The marginal distributions further support this observation, showing that both the peak indices and the 80% cumulative indices remain concentrated within the low-frequency region of the complex mode spectrum. These results provide sample-level evidence that CPES systematically emphasizes dominant collective motions of the bound complex during cross-attention for dynamic modes.
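The two summary statistics used in this analysis, the peak mode index and the 80% cumulative attention index, can be derived from a single attention map as sketched below. The function name `mode_attention_summary`, the zero-based indexing, and the toy attention matrix are assumptions for illustration; the paper does not state which indexing convention its figures use.

```python
def mode_attention_summary(attn, mass=0.8):
    """Summarize a (query modes x complex modes) attention map.

    Returns the mean attention per complex mode, the index of the mode with
    the largest mean attention, and the first index at which the cumulative
    mean attention reaches `mass`.
    """
    n_queries = len(attn)
    mean = [sum(col) / n_queries for col in zip(*attn)]
    peak = max(range(len(mean)), key=mean.__getitem__)
    running = 0.0
    for idx, value in enumerate(mean):
        running += value
        if running >= mass:
            return mean, peak, idx
    return mean, peak, len(mean) - 1


# Toy map: 2 query modes attending to 4 complex modes; each row sums to 1.
attn = [
    [0.50, 0.25, 0.125, 0.125],
    [0.75, 0.125, 0.0625, 0.0625],
]
mean, peak, idx80 = mode_attention_summary(attn)
print(mean, peak, idx80)  # [0.625, 0.1875, 0.09375, 0.09375] 0 1
```

In this toy case the first two complex modes already account for more than 80% of the mean attention, which mirrors the concentration on low-index modes described above.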
## 5Conclusion
In this work, we proposed CPES, a curvature\-informed potential energy surface graph neural network for protein\-ligand binding affinity prediction\. By introducing Hessian eigenspectrum descriptors derived from the anisotropic network model \(ANM\), our model incorporates curvature\-informed PES representations into molecular representation learning\. Extensive experiments across diverse benchmark datasets demonstrate that CPES achieves competitive predictive performance together with robust generalization capability\. Ablation studies further confirm the effectiveness of the PES curvature descriptors and both the static and dynamic attention modules\. Moreover, interpretability analyses reveal that the static branch identifies sparse binding\-relevant interaction regions, whereas the dynamic branch consistently focuses on low\-frequency collective modes of the bound complex, highlighting the importance of collective conformational dynamics in protein\-ligand binding\. The results demonstrate that PES curvature provides an effective physical inductive bias by characterizing molecular conformational flexibility and collective dynamics beyond static interaction geometry\. In future work, we will explore more accurate physical energy models and broader applications of curvature\-informed potential energy surface representations in structure\-based drug discovery tasks, such as flexible docking and virtual screening\.
## References
- \[1\]K\. Atz, F\. Grisoni, and G\. Schneider\(2021\)Geometric deep learning on molecular representations\.Nature Machine Intelligence3\(12\),pp\. 1023–1032\.External Links:[Document](https://dx.doi.org/10.1038/s42256-021-00418-8)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1)\.
- \[2\]P\. J\. Ballester and J\. B\. O\. Mitchell\(2010\)A machine learning approach to predicting protein–ligand binding affinity with applications to molecular docking\.Bioinformatics26\(9\),pp\. 1169–1175\.External Links:[Document](https://dx.doi.org/10.1093/bioinformatics/btq112)Cited by:[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.14.8.2),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[3\]X\. Chen, J\. Huang, T\. Shen, H\. Zhang, L\. Xu, M\. Yang, X\. Xie, Y\. Yan, and J\. Yan\(2024\)DEAttentionDTA: protein–ligand binding affinity prediction based on dynamic embedding and self\-attention\.Bioinformatics40\(6\),pp\. btae319\.External Links:[Document](https://dx.doi.org/10.1093/bioinformatics/btae319)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.29.23.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[4\]A\. Dhakal, C\. McKay, J\. J\. Tanner, and J\. Cheng\(2022\)Artificial intelligence in the prediction of protein–ligand interactions: recent advances and future directions\.Briefings in Bioinformatics23\(1\),pp\. bbab476\.External Links:[Document](https://dx.doi.org/10.1093/bib/bbab476)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2)\.
- \[5\]E\. N\. Feinberg, D\. Sur, Z\. Wu, B\. E\. Husic, H\. Mai, Y\. Li, S\. Sun, J\. Yang, B\. Ramsundar, and V\. S\. Pande\(2018\)PotentialNet for Molecular Property Prediction\.ACS Central Science4\(11\),pp\. 1520–1530\.External Links:[Document](https://dx.doi.org/10.1021/acscentsci.8b00507)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.17.11.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[6\]M\. K\. Gilson and H\. Zhou\(2007\)Calculation of protein\-ligand binding affinities\.Annual Review of Biophysics36,pp\. 21–42\.External Links:[Document](https://dx.doi.org/https%3A//doi.org/10.1146/annurev.biophys.36.040306.132550)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2)\.
- \[7\]D\. Jiang, C\. Hsieh, Z\. Wu, Y\. Kang, J\. Wang, E\. Wang, B\. Liao, C\. Shen, L\. Xu, J\. Wu, D\. Cao, and T\. Hou\(2021\)InteractionGraphNet: A Novel and Efficient Deep Graph Representation Learning Framework for Accurate Protein–Ligand Interaction Predictions\.Journal of Medicinal Chemistry64\(24\),pp\. 18209–18232\.External Links:[Document](https://dx.doi.org/10.1021/acs.jmedchem.1c01830)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.20.14.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[8\]D\. Jiang, Z\. Ye, C\. Hsieh, Z\. Yang, X\. Zhang, Y\. Kang, H\. Du, Z\. Wu, J\. Wang, Y\. Zeng, H\. Zhang, X\. Wang, M\. Wang, X\. Yao, S\. Zhang, J\. Wu, and T\. Hou\(2023\)MetalProGNet: a structure\-based deep graph model for metalloprotein–ligand interaction predictions\.Chemical Science14\(8\),pp\. 2054–2069\.External Links:[Document](https://dx.doi.org/10.1039/D2SC06576B)Cited by:[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.23.17.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[9\]Z\. Jin, T\. Wu, T\. Chen, D\. Pan, X\. Wang, J\. Xie, L\. Quan, and Q\. Lyu\(2023\)CAPLA: improved prediction of protein–ligand binding affinity by a deep learning approach based on a cross\-attention mechanism\.Bioinformatics39\(2\),pp\. btad049\.External Links:[Document](https://dx.doi.org/10.1093/bioinformatics/btad049)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.27.21.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[10\]W\. L\. Jorgensen\(2004\)The many roles of computation in drug discovery\.Science303\(5665\),pp\. 1813–1818\.External Links:[Document](https://dx.doi.org/10.1126/science.1096361)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2)\.
- \[11\]J\. B\. Jr\. Dunbar, R\. D\. Smith, K\. L\. Damm\-Ganamet, A\. Ahmed, E\. X\. Esposito, J\. Delproposto, K\. Chinnaswamy, Y\. Kang, G\. Kubish, J\. E\. Gestwicki, J\. A\. Stuckey, and H\. A\. Carlson\(2013\)CSAR Data Set Release 2012: Ligands, Affinities, Complexes, and Docking Decoys\.Journal of Chemical Information and Modeling53\(8\),pp\. 1842–1852\.External Links:[Document](https://dx.doi.org/10.1021/ci4000486)Cited by:[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2.p1.1)\.
- \[12\]D\. B\. Kitchen, H\. Decornez, J\. R\. Furr, and J\. Bajorath\(2004\)Docking and scoring in virtual screening for drug discovery: methods and applications\.Nature Reviews Drug Discovery3\(11\),pp\. 935–949\.External Links:[Document](https://dx.doi.org/10.1038/nrd1549)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2)\.
- \[13\]H\. Lim, S\. Kim, and S\. Lee\(2025\)CheapNet: Cross\-attention on Hierarchical representations for Efficient protein\-ligand binding Affinity Prediction\.InThe Thirteenth International Conference on Learning Representations,Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.30.24.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p1.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1),[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2.p1.1)\.
- \[14\]J\. Lim, S\. Ryu, K\. Park, Y\. J\. Choe, J\. Ham, and W\. Y\. Kim\(2019\)Predicting Drug–Target Interaction Using a Novel Graph Neural Network with 3D Structure\-Embedded Graph Representation\.Journal of Chemical Information and Modeling59\(9\),pp\. 3981–3988\.External Links:[Document](https://dx.doi.org/10.1021/acs.jcim.9b00387)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.19.13.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[15\]T\. Nguyen, H\. Le, T\. P\. Quinn, T\. Nguyen, T\. D\. Le, and S\. Venkatesh\(2021\)GraphDTA: predicting drug–target binding affinity with graph neural networks\.Bioinformatics37\(8\),pp\. 1140–1147\.External Links:[Document](https://dx.doi.org/10.1093/bioinformatics/btaa921)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.10.4.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.11.5.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.12.6.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.9.3.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[16\]H\. Öztürk, A\. Özgür, and E\. Ozkirimli\(2018\)DeepDTA: deep drug–target binding affinity prediction\.Bioinformatics34\(17\),pp\. i821–i829\.External Links:[Document](https://dx.doi.org/10.1093/bioinformatics/bty593)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.8.2.2),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[17\]V\. G\. Satorras, E\. Hoogeboom, and M\. Welling\(2021\)E\(n\) Equivariant Graph Neural Networks\.InProceedings of the 38th International Conference on Machine Learning,pp\. 9323–9332\.Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.21.15.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[18\]J\. W\. Scannell, A\. Blanckley, H\. Boldon, and B\. Warrington\(2012\)Diagnosing the decline in pharmaceutical R&D efficiency\.Nature Reviews Drug Discovery11\(3\),pp\. 191–200\.External Links:[Document](https://dx.doi.org/10.1038/nrd3681)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2)\.
- \[19\]K\. Sch U Tt, P\. Kindermans, H\. E\. Sauceda Felix, S\. Chmiela, A\. Tkatchenko, and K\. M U Ller\(2017\)SchNet: A continuous\-filter convolutional neural network for modeling quantum interactions\.InAdvances in Neural Information Processing Systems,Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.18.12.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[20\]I\. A\. Sedov and Y\. F\. Zuev\(2025\)Protein–ligand interactions: recent advances in biophysics, biochemistry, and bioinformatics\.International Journal of Molecular Sciences26\(19\),pp\. 9576\.External Links:[Document](https://dx.doi.org/10.3390/ijms26199576%20ER%20-)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2)\.
- \[21\]D\. S\. Spassov\(2024\)Binding affinity determination in drug design: insights from lock and key, induced fit, conformational selection, and inhibitor trapping models\.International Journal of Molecular Sciences25\(13\),pp\. 7124\.External Links:[Document](https://dx.doi.org/10.3390/ijms25137124%20ER%20-)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2)\.
- \[22\]M\. M\. Stepniewska\-Dziubinska, P\. Zielenkiewicz, and P\. Siedlecki\(2018\)Development and evaluation of a deep learning model for protein–ligand binding affinity prediction\.Bioinformatics34\(21\),pp\. 3666–3674\.External Links:[Document](https://dx.doi.org/10.1093/bioinformatics/bty374)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.15.9.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[23\]H\. Tan, Z\. Wang, and G\. Hu\(2024\)GAABind: a geometry\-aware attention\-based network for accurate protein–ligand binding pose and binding affinity prediction\.Briefings in Bioinformatics25\(1\),pp\. bbad462\.External Links:[Document](https://dx.doi.org/10.1093/bib/bbad462)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.28.22.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[24\]H\. Wang\(2024\)Prediction of protein–ligand binding affinity via deep learning models\.Briefings in Bioinformatics25\(2\),pp\. bbae081\.External Links:[Document](https://dx.doi.org/10.1093/bib/bbae081)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1)\.
- \[25\]L\. Wang, Y\. Wu, Y\. Deng, B\. Kim, L\. Pierce, G\. Krilov, D\. Lupyan, S\. Robinson, M\. K\. Dahlgren, J\. Greenwood, D\. L\. Romero, C\. Masse, J\. L\. Knight, T\. Steinbrecher, T\. Beuming, W\. Damm, E\. Harder, W\. Sherman, M\. Brewer, R\. Wester, M\. Murcko, L\. Frye, R\. Farid, T\. Lin, D\. L\. Mobley, W\. L\. Jorgensen, B\. J\. Berne, R\. A\. Friesner, and R\. Abel\(2015\)Accurate and reliable prediction of relative ligand binding potency in prospective drug discovery by way of a modern free\-energy calculation protocol and force field\.Journal of the American Chemical Society137\(7\),pp\. 2695–2703\.External Links:[Document](https://dx.doi.org/10.1021/ja512751q)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2)\.
- \[26\]Y\. Wang, S\. Wu, Y\. Duan, and Y\. Huang\(2022\)A point cloud\-based deep learning strategy for protein–ligand binding affinity prediction\.Briefings in Bioinformatics23\(1\),pp\. bbab474\.External Links:[Document](https://dx.doi.org/10.1093/bib/bbab474)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2)\.
- \[27\]Y\. Wang, Q\. Jiao, J\. Wang, X\. Cai, W\. Zhao, and X\. Cui\(2023\)Prediction of protein\-ligand binding affinity with deep learning\.Computational and Structural Biotechnology Journal21,pp\. 5796–5806\.External Links:[Document](https://dx.doi.org/10.1016/j.csbj.2023.11.009)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1)\.
- \[28\]M\. J\. Waring, J\. Arrowsmith, A\. R\. Leach, P\. D\. Leeson, S\. Mandrell, R\. M\. Owen, G\. Pairaudeau, W\. D\. Pennie, S\. D\. Pickett, J\. Wang, O\. Wallace, and A\. Weir\(2015\)An analysis of the attrition of drug candidates from four major pharmaceutical companies\.Nature Reviews Drug Discovery14\(7\),pp\. 475–486\.External Links:[Document](https://dx.doi.org/10.1038/nrd4609)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p1.2)\.
- \[29\]Z\. Yang, W\. Zhong, Q\. Lv, T\. Dong, G\. Chen, and C\. Y\. Chen\(2024\)Interaction\-Based Inductive Bias in Graph Neural Networks: Enhancing Protein\-Ligand Binding Affinity Predictions From 3D Structures\.IEEE Transactions on Pattern Analysis and Machine Intelligence46\(12\),pp\. 8191–8208\.External Links:[Document](https://dx.doi.org/10.1109/TPAMI.2024.3400515)Cited by:[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.25.19.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[30\]Z\. Yang, W\. Zhong, Q\. Lv, T\. Dong, and C\. Yu\-Chian Chen\(2023\)Geometric Interaction Graph Neural Network for Predicting Protein–Ligand Binding Affinities from 3D Structures \(GIGN\)\.The Journal of Physical Chemistry Letters14\(8\),pp\. 2020–2033\.External Links:[Document](https://dx.doi.org/10.1021/acs.jpclett.2c03906)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.22.16.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p1.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[31\]Z\. Yang, W\. Zhong, L\. Zhao, and C\. Yu\-Chian Chen\(2022\)MGraphDTA: deep multiscale graph neural network for explainable drug–target binding affinity prediction\.Chemical Science13\(3\),pp\. 816–833\.External Links:[Document](https://dx.doi.org/10.1039/D1SC05180F)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.13.7.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[32\]M\. Yazdani\-Jahromi, N\. Yousefi, A\. Tayebi, E\. Kolanthai, C\. J\. Neal, S\. Seal, and O\. O\. Garibay\(2022\)AttentionSiteDTI: an interpretable graph\-based model for drug\-target interaction prediction using NLP sentence\-level relation classification\.Briefings in Bioinformatics23\(4\),pp\. bbac272\.External Links:[Document](https://dx.doi.org/10.1093/bib/bbac272)Cited by:[§1](https://arxiv.org/html/2606.14217#S1.p2.1),[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.26.20.2),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[33\]S\. Zhang, Y\. Jin, T\. Liu, Q\. Wang, Z\. Zhang, S\. Zhao, and B\. Shan\(2023\)SS\-GNN: A Simple\-Structured Graph Neural Network for Affinity Prediction\.ACS Omega8\(25\),pp\. 22496–22507\.External Links:[Document](https://dx.doi.org/10.1021/acsomega.3c00085)Cited by:[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.24.18.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
- \[34\]L\. Zheng, J\. Fan, and Y\. Mu\(2019\)OnionNet: a Multiple\-Layer Intermolecular\-Contact\-Based Convolutional Neural Network for Protein–Ligand Binding Affinity Prediction\.ACS Omega4\(14\),pp\. 15956–15965\.External Links:[Document](https://dx.doi.org/10.1021/acsomega.9b01997)Cited by:[Table 1](https://arxiv.org/html/2606.14217#S3.T1.6.6.16.10.1),[§4\.1](https://arxiv.org/html/2606.14217#S4.SS1.p3.1)\.
Curvature\-Informed Potential Energy Surface for Protein\-Ligand Binding Affinity Prediction \(Supplementary Information\)
This supplementary information provides additional discussions, theoretical derivations, architectural details, and extended experimental results that complement the main manuscript\. A broader discussion of related works is included, covering molecular dynamics and energy\-based approaches, geometric deep learning methods for protein\-ligand binding affinity prediction, as well as classical conformational dynamics methods such as normal mode analysis \(NMA\) and anisotropic network models \(ANM\), together with additional references\. Detailed derivations of the SE\(3\) equivariance property of the Hessian and the invariance of its eigenvalue spectrum under rigid\-body transformations are presented\. Additional descriptions of the CPES framework are also provided, particularly for the geometry\-aware static interaction branch\. More experimental details and extended results are further included, including dataset processing procedures, evaluation settings, experiments on dissimilar protein scenarios, and analyses of different graph construction strategies\.
## 1 Additional Discussion on Related Works
### 1\.1 Molecular Dynamics and Energy\-Based Approaches
Molecular dynamics \(MD\) simulations constitute a fundamental class of physics\-based approaches for modeling protein\-ligand binding by explicitly describing atomic motions under physically grounded force fields\. By integrating Newtonian equations of motion, MD enables the exploration of conformational ensembles and provides mechanistic insights into binding phenomena such as induced fit, conformational selection, and binding\-unbinding pathways\[[27](https://arxiv.org/html/2606.14217#biba.bib36),[12](https://arxiv.org/html/2606.14217#biba.bib37)\]\. Beyond classical MD, several complementary energy\-based approaches have been developed for protein\-ligand docking and binding affinity prediction\[[41](https://arxiv.org/html/2606.14217#biba.bib38)\]\. Quantum chemical methods, including density functional theory \(DFT\), model protein\-ligand interactions by explicitly describing electronic structure and polarization effects, but their high computational cost limits their applicability to small systems or localized regions\[[43](https://arxiv.org/html/2606.14217#biba.bib39),[47](https://arxiv.org/html/2606.14217#biba.bib40)\]\. Despite their physical rigor, these energy\-based methods typically require substantial computational resources and long simulation timescales, limiting their practicality for large\-scale screening or dataset\-level affinity prediction\[[2](https://arxiv.org/html/2606.14217#biba.bib41),[26](https://arxiv.org/html/2606.14217#biba.bib42),[59](https://arxiv.org/html/2606.14217#biba.bib43),[29](https://arxiv.org/html/2606.14217#biba.bib44)\]\. This limitation motivates the development of deep learning\-based approaches that incorporate physically meaningful energy\-inspired cues while maintaining scalability and efficiency\.
### 1\.2 Geometric Deep Learning for Protein\-Ligand Binding Affinity Prediction
Geometric deep learning \(GDL\) has emerged as a powerful paradigm for learning from structured data by generalizing neural networks to non\-Euclidean domains such as graphs, manifolds, and meshes\[[38](https://arxiv.org/html/2606.14217#biba.bib45)\]\. Within this framework, graph neural networks \(GNNs\) have become the de facto choice for learning with graph\-structured data\[[7](https://arxiv.org/html/2606.14217#biba.bib46),[10](https://arxiv.org/html/2606.14217#biba.bib47),[30](https://arxiv.org/html/2606.14217#biba.bib48),[57](https://arxiv.org/html/2606.14217#biba.bib49),[34](https://arxiv.org/html/2606.14217#biba.bib50)\], owing to their ability to model relational information through message passing while incorporating geometric priors and symmetry properties\. By explicitly encoding invariance or equivariance to transformations such as translation, rotation, and permutation\[[49](https://arxiv.org/html/2606.14217#biba.bib51),[51](https://arxiv.org/html/2606.14217#biba.bib52),[18](https://arxiv.org/html/2606.14217#biba.bib53),[17](https://arxiv.org/html/2606.14217#biba.bib54),[14](https://arxiv.org/html/2606.14217#biba.bib55)\], geometric deep learning architectures achieve improved accuracy and data efficiency, particularly in physically grounded domains\.
Motivated by these properties, geometric deep learning has been widely adopted for protein\-ligand binding affinity prediction and other molecular modeling tasks\[[21](https://arxiv.org/html/2606.14217#biba.bib56),[52](https://arxiv.org/html/2606.14217#biba.bib57),[19](https://arxiv.org/html/2606.14217#biba.bib58),[46](https://arxiv.org/html/2606.14217#biba.bib59)\]\. Graph\-based models represent protein\-ligand complexes as three\-dimensional graphs, where nodes correspond to atoms and edges encode covalent or non\-covalent interactions, enabling explicit modeling of spatial relationships and atomic interactions\. Compared with sequence\-based or descriptor\-driven approaches, such interaction\-based models introduce inductive biases that are more closely aligned with physical binding mechanisms and have demonstrated superior performance across diverse benchmarks\. Representative frameworks include interaction graph neural networks \(IGNNs\), such as GIGN\[[58](https://arxiv.org/html/2606.14217#biba.bib28)\], which integrate physically meaningful interactions into invariant message\-passing schemes, as well as hierarchical models like CheapNet\[[35](https://arxiv.org/html/2606.14217#biba.bib29)\] that aggregate atom\-level representations into cluster\-level interactions via differentiable pooling and cross\-attention\. Despite these advances, most existing geometric deep learning approaches still rely on static representations derived from a single bound conformation, implicitly treating protein\-ligand complexes as rigid entities and overlooking conformational flexibility and binding\-induced dynamics\. This limitation motivates the development of models that incorporate physically informed inductive biases beyond static interaction geometry\.
### 1\.3 Inductive Bias from Conformational Energy Distributions
Inductive bias plays a central role in machine learning by shaping how models generalize beyond observed data\[[22](https://arxiv.org/html/2606.14217#biba.bib60),[32](https://arxiv.org/html/2606.14217#biba.bib61),[44](https://arxiv.org/html/2606.14217#biba.bib62)\]\. For protein\-ligand binding affinity prediction, inductive biases that reflect underlying physical and biological mechanisms are particularly important, as they encourage representations that capture causal structure\-function relationships rather than superficial correlations, among which conformational energy distributions play a central role in shaping molecular stability and adaptability\[[11](https://arxiv.org/html/2606.14217#biba.bib63),[33](https://arxiv.org/html/2606.14217#biba.bib64),[23](https://arxiv.org/html/2606.14217#biba.bib65)\]\.
From a biophysical perspective, protein\-ligand binding does not occur between rigid structures but rather involves ensembles of conformations characterized by an underlying potential energy function\. The accessibility of binding\-competent states and binding\-induced conformational changes is governed by the local curvature of the energy function in the vicinity of stable conformations\. While explicitly sampling full conformational energy distributions using molecular dynamics is computationally prohibitive at scale, the local behavior of the potential energy function already encodes essential information about molecular flexibility\. Classical approaches such as normal mode analysis \(NMA\)\[[50](https://arxiv.org/html/2606.14217#biba.bib66),[5](https://arxiv.org/html/2606.14217#biba.bib67),[39](https://arxiv.org/html/2606.14217#biba.bib68),[4](https://arxiv.org/html/2606.14217#biba.bib69),[54](https://arxiv.org/html/2606.14217#biba.bib70),[37](https://arxiv.org/html/2606.14217#biba.bib71)\] and elastic network models\[[3](https://arxiv.org/html/2606.14217#biba.bib72),[15](https://arxiv.org/html/2606.14217#biba.bib73),[25](https://arxiv.org/html/2606.14217#biba.bib74),[28](https://arxiv.org/html/2606.14217#biba.bib75)\] leverage this observation by approximating the local curvature of the potential energy function around equilibrium conformations\. By analyzing second\-order variations of the energy function, these methods characterize intrinsic stiffness, collective motions, and dominant deformation modes that are closely related to binding\-relevant conformational changes\. Importantly, such local energy\-derived descriptors provide compact, physically grounded summaries of molecular flexibility without requiring extensive sampling\.
Motivated by these principles, incorporating representations inspired by conformational energy distributions introduces an inductive bias that aligns learning objectives with physical binding mechanisms\[[9](https://arxiv.org/html/2606.14217#biba.bib76),[40](https://arxiv.org/html/2606.14217#biba.bib77),[8](https://arxiv.org/html/2606.14217#biba.bib78)\]\. Rather than relying solely on static interaction geometry, energy\-informed inductive biases enable models to capture how molecular systems respond to perturbations and adapt during binding, that is, conformational dynamic differences between bound and unbound states, thereby improving both generalization and interpretability in protein\-ligand binding affinity prediction\.
## 2 Proof of SE(3) Equivariance of Hessian
Let $\mathbf{q}\in\mathbb{R}^{3N}$ be the configuration vector and $V:\mathbb{R}^{3N}\to\mathbb{R}$ the potential energy function. Consider an arbitrary rotation matrix $\mathbf{R}\in\mathrm{SO}(3)$ and a translation vector $\mathbf{t}\in\mathbb{R}^{3}$. Define the block-diagonal matrix
$$\mathbf{Q}=\operatorname{diag}(\mathbf{R},\ldots,\mathbf{R})\in\mathbb{R}^{3N\times 3N},$$
and the translation vector
$$\mathbf{c}=(\mathbf{t},\ldots,\mathbf{t})\in\mathbb{R}^{3N}.$$
The global rigid-body transformation is given by
$$\mathbf{q}^{\prime}=\mathbf{Q}\mathbf{q}+\mathbf{c}.$$
Assume that the potential energy depends only on the internal geometric relations of the system and is therefore invariant under such a transformation, i.e.,
$$V^{\prime}(\mathbf{q}^{\prime})\equiv V(\mathbf{q}).$$
Since $\mathbf{R}\in\mathrm{SO}(3)$, the induced matrix $\mathbf{Q}$ is orthogonal and satisfies $\mathbf{Q}^{\top}\mathbf{Q}=\mathbf{I}$. Hence the transformation is invertible, with inverse
$$\mathbf{q}=\mathbf{Q}^{\top}(\mathbf{q}^{\prime}-\mathbf{c}).$$
Consequently, $V(\mathbf{q})$ can be regarded as the composite function $V(\mathbf{q}(\mathbf{q}^{\prime}))$.
Taking the gradient of both sides of the identity $V^{\prime}(\mathbf{q}^{\prime})\equiv V(\mathbf{q}(\mathbf{q}^{\prime}))$ with respect to $\mathbf{q}^{\prime}$, the left-hand side yields $\nabla_{\mathbf{q}^{\prime}}V^{\prime}(\mathbf{q}^{\prime})$, while the right-hand side is a scalar-valued composite function. By the chain rule for scalar functions (with the convention that gradients are column vectors), it follows that
$$\nabla_{\mathbf{q}^{\prime}}V\bigl(\mathbf{q}(\mathbf{q}^{\prime})\bigr)=\left(\frac{\partial\mathbf{q}}{\partial\mathbf{q}^{\prime}}\right)^{\!\top}\nabla_{\mathbf{q}}V(\mathbf{q}).$$
From $\mathbf{q}=\mathbf{Q}^{\top}(\mathbf{q}^{\prime}-\mathbf{c})$, the Jacobian matrix is constant and given by
$$\frac{\partial\mathbf{q}}{\partial\mathbf{q}^{\prime}}=\mathbf{Q}^{\top},$$
where the translation vector $\mathbf{c}$ vanishes upon differentiation. Substituting this result yields
$$\nabla_{\mathbf{q}^{\prime}}V^{\prime}(\mathbf{q}^{\prime})=\mathbf{Q}\,\nabla_{\mathbf{q}}V(\mathbf{q}).$$
Differentiating once more with respect to $\mathbf{q}^{\prime}$ gives
$$\nabla_{\mathbf{q}^{\prime}}^{2}V^{\prime}(\mathbf{q}^{\prime})=\nabla_{\mathbf{q}^{\prime}}\!\left(\mathbf{Q}\,\nabla_{\mathbf{q}}V(\mathbf{q})\right).$$
Since $\mathbf{Q}$ is independent of $\mathbf{q}^{\prime}$, it can be taken outside the derivative:
$$\nabla_{\mathbf{q}^{\prime}}^{2}V^{\prime}(\mathbf{q}^{\prime})=\mathbf{Q}\,\nabla_{\mathbf{q}^{\prime}}\!\left(\nabla_{\mathbf{q}}V(\mathbf{q})\right).$$
Note that $\nabla_{\mathbf{q}}V(\mathbf{q})$ is a vector-valued function from $\mathbb{R}^{3N}$ to $\mathbb{R}^{3N}$, and its first derivative with respect to $\mathbf{q}^{\prime}$ is a Jacobian matrix. Applying the chain rule for vector-valued composite functions, the following is derived:
$$\nabla_{\mathbf{q}^{\prime}}\!\left(\nabla_{\mathbf{q}}V(\mathbf{q}(\mathbf{q}^{\prime}))\right)=\left(\nabla_{\mathbf{q}}^{2}V(\mathbf{q})\right)\frac{\partial\mathbf{q}}{\partial\mathbf{q}^{\prime}}.$$
Substituting $\frac{\partial\mathbf{q}}{\partial\mathbf{q}^{\prime}}=\mathbf{Q}^{\top}$ yields
$$\nabla_{\mathbf{q}^{\prime}}^{2}V^{\prime}(\mathbf{q}^{\prime})=\mathbf{Q}\left(\nabla_{\mathbf{q}}^{2}V(\mathbf{q})\right)\mathbf{Q}^{\top}.$$
Defining $\mathbf{H}(\mathbf{q})=\nabla_{\mathbf{q}}^{2}V(\mathbf{q})$ and $\mathbf{H}^{\prime}(\mathbf{q}^{\prime})=\nabla_{\mathbf{q}^{\prime}}^{2}V^{\prime}(\mathbf{q}^{\prime})$, we finally obtain
$$\mathbf{H}^{\prime}(\mathbf{q}^{\prime})=\mathbf{Q}\,\mathbf{H}(\mathbf{q})\,\mathbf{Q}^{\top}.$$
This completes the proof of SE(3) equivariance of the Hessian.
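The relation between the transformed and original Hessians can be checked numerically. Because $\mathbf{Q}$ is orthogonal, the two Hessians are similar matrices and therefore share the same eigenvalues, which is the spectral invariance that the curvature descriptors depend on. The following is a minimal NumPy sketch under stated assumptions: it uses an anisotropic-network-style harmonic Hessian with a hypothetical 8 Å cutoff and unit spring constant. The helper names `anm_hessian` and `random_rotation` and all parameter values are illustrative and are not taken from the CPES code.

```python
import numpy as np

def anm_hessian(coords, cutoff=8.0, gamma=1.0):
    """Hessian of a harmonic elastic network at its equilibrium geometry (assumed model)."""
    n = coords.shape[0]
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            diff = coords[j] - coords[i]
            dist2 = float(diff @ diff)
            if dist2 > cutoff ** 2:
                continue
            block = -gamma * np.outer(diff, diff) / dist2   # off-diagonal 3x3 block
            hess[3*i:3*i+3, 3*j:3*j+3] = block
            hess[3*j:3*j+3, 3*i:3*i+3] = block
            hess[3*i:3*i+3, 3*i:3*i+3] -= block             # diagonal blocks balance the row
            hess[3*j:3*j+3, 3*j:3*j+3] -= block
    return hess

def random_rotation(rng):
    """Random proper rotation obtained from a QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # make the factorization unique
    if np.linalg.det(q) < 0:         # flip one axis to land in SO(3)
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
coords = rng.normal(scale=3.0, size=(12, 3))   # 12 hypothetical atoms
R = random_rotation(rng)
t = rng.normal(size=3)

H = anm_hessian(coords)
H_prime = anm_hessian(coords @ R.T + t)        # q' = Q q + c, applied per atom
Q = np.kron(np.eye(len(coords)), R)            # block-diagonal diag(R, ..., R)

equivariant = np.allclose(H_prime, Q @ H @ Q.T, atol=1e-9)
same_spectrum = np.allclose(np.linalg.eigvalsh(H_prime), np.linalg.eigvalsh(H), atol=1e-8)
print(equivariant, same_spectrum)
```

The check works for any potential that depends only on interatomic distances; the elastic network is used here only because its Hessian has a closed form.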
## 3 Details of the CPES Framework
Given a complex graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$, the node set is defined as $\mathcal{V}=\mathcal{V}_{\text{lig}}\cup\mathcal{V}_{\text{pro}}$, and the edge set as $\mathcal{E}=\mathcal{E}_{\mathrm{intra}}\cup\mathcal{E}_{\mathrm{inter}}$ (where $\mathcal{E}_{\mathrm{intra}}=\mathcal{E}_{\text{lig}}\cup\mathcal{E}_{\text{pro}}$), representing intra-molecular and inter-molecular interactions, respectively. Each node $i\in\mathcal{V}$ is associated with an initial feature vector $\mathbf{x}_{i}$ and a three-dimensional coordinate $\mathbf{r}_{i}$. Node features are first embedded into a $d$-dimensional hidden space via a learnable embedding function
$$\mathbf{h}_{i}^{(0)}=\operatorname{MLP}_{\text{emb}}(\mathbf{x}_{i})\in\mathbb{R}^{d}.$$
In the static structural branch, geometry-aware message passing is performed over both edge types $\mathcal{E}_{r}$ with $r\in\{\mathrm{intra},\mathrm{inter}\}$. At layer $t$, a channel-wise modulation vector induced by relative geometric relations is defined as
$$\boldsymbol{\rho}_{ij}^{(r)}=\psi^{(r)}\!\left(\lVert\mathbf{r}_{i}-\mathbf{r}_{j}\rVert\right)\in\mathbb{R}^{d},$$
where $\psi^{(r)}(\cdot)$ is a learnable mapping that transforms inter-node distances into channel-wise modulation coefficients. The aggregated message for node $i$ under edge type $r$ is given by
$$\mathbf{m}_{i}^{(r,t)}=\sum_{j:(j,i)\in\mathcal{E}_{r}}\mathbf{h}_{j}^{(t)}\odot\boldsymbol{\rho}_{ji}^{(r)},$$
where $\odot$ denotes the element-wise product, enabling geometry-aware modulation of neighbor features. Node representations are then updated via an MLP transformation with a residual connection, i.e.,
$$\begin{aligned}
\mathbf{h}_{i}^{(\mathrm{intra},t+1)}&=\operatorname{MLP}_{\mathrm{intra}}\!\left(\mathbf{m}_{i}^{(\mathrm{intra},t)}\right)+\mathbf{h}_{i}^{(t)},\\
\mathbf{h}_{i}^{(\mathrm{inter},t+1)}&=\operatorname{MLP}_{\mathrm{inter}}\!\left(\mathbf{m}_{i}^{(\mathrm{inter},t)}\right)+\mathbf{h}_{i}^{(t)}.
\end{aligned}$$
The updates from intra-molecular and inter-molecular edges are fused at the node level as
$$\mathbf{h}_{i}^{(t+1)}=\tfrac{1}{2}\left(\mathbf{h}_{i}^{(\mathrm{intra},\,t+1)}+\mathbf{h}_{i}^{(\mathrm{inter},\,t+1)}\right).$$
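The message-passing layer described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: the bias-free two-layer MLPs built by `make_mlp`, the hidden width `d = 8`, the toy edge lists, and the random weights are all assumptions introduced here. Because the modulation depends only on inter-node distances, the output should not change when all coordinates are translated, which the last lines check.

```python
import numpy as np

def make_mlp(d_in, d_out, rng):
    """Hypothetical bias-free two-layer MLP with ReLU, returned as a closure."""
    w1 = rng.normal(scale=0.5, size=(d_in, d_out))
    w2 = rng.normal(scale=0.5, size=(d_out, d_out))
    return lambda x: np.maximum(x @ w1, 0.0) @ w2

def aggregate(h, r, edges, psi):
    """m_i = sum over directed edges (j, i) of h_j * rho_ji, with rho_ji = psi(||r_j - r_i||)."""
    m = np.zeros_like(h)
    for j, i in edges:
        dist = np.linalg.norm(r[j] - r[i])
        m[i] += h[j] * psi(np.array([dist]))   # element-wise (channel-wise) modulation
    return m

def layer(h, r, edges, psi, mlp):
    """One geometry-aware update: per-edge-type residual MLP, then averaging."""
    out = {}
    for kind in ("intra", "inter"):
        m = aggregate(h, r, edges[kind], psi[kind])
        out[kind] = mlp[kind](m) + h           # residual connection
    return 0.5 * (out["intra"] + out["inter"])

rng = np.random.default_rng(0)
d = 8
h0 = rng.normal(size=(5, d))                   # nodes 0-2: ligand, nodes 3-4: protein
r = rng.normal(scale=2.0, size=(5, 3))
edges = {
    "intra": [(0, 1), (1, 0), (1, 2), (2, 1), (3, 4), (4, 3)],
    "inter": [(2, 3), (3, 2)],
}
psi = {k: make_mlp(1, d, rng) for k in edges}  # distance -> d modulation coefficients
mlp = {k: make_mlp(d, d, rng) for k in edges}

h1 = layer(h0, r, edges, psi, mlp)
h1_shifted = layer(h0, r + np.array([1.0, -2.0, 0.5]), edges, psi, mlp)
print(h1.shape, np.allclose(h1, h1_shifted))
```

Keeping separate `psi` and `mlp` entries per edge type mirrors the per-type superscript $(r)$ in the equations; a trained model would learn these weights instead of sampling them.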
To extract higher-level, hierarchically organized representations from atom-level features, differentiable clustering pooling is applied separately to the ligand and protein subgraphs. For $s\in\{\text{lig},\text{pro}\}$, let
$$\mathbf{F}_{s}\in\mathbb{R}^{N_{s}\times d},\qquad\mathbf{A}_{s}\in\mathbb{R}^{N_{s}\times N_{s}},$$
denote the node feature matrix and adjacency matrix, respectively, where $N_{s}$ denotes the number of nodes in the ligand or protein subgraph. A soft assignment matrix is produced by a graph neural network (GNN)
$$\mathbf{S}_{s}=\operatorname{softmax}\!\left(\operatorname{GNN}_{s}(\mathbf{F}_{s},\mathbf{A}_{s})\right)\in\mathbb{R}^{N_{s}\times C_{s}},$$
where $C_{s}$ denotes the number of clusters for subgraph $s$. Cluster-level representations and adjacencies are then constructed as
$$\begin{aligned}
\mathbf{Z}_{s}&=\mathbf{S}_{s}^{\top}\mathbf{F}_{s}\in\mathbb{R}^{C_{s}\times d},\\
\widetilde{\mathbf{A}}_{s}&=\mathbf{S}_{s}^{\top}\mathbf{A}_{s}\mathbf{S}_{s}\in\mathbb{R}^{C_{s}\times C_{s}}.
\end{aligned}$$
After further updates on the cluster graphs, cluster-level representations $\mathbf{Z}_{\text{lig}}$ and $\mathbf{Z}_{\text{pro}}$ are obtained.
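The soft clustering step can be illustrated with a short NumPy sketch. It is a sketch under assumptions, not the CPES implementation: the assignment network is replaced by a single mean-aggregation layer with a hypothetical weight matrix `W_assign`, the softmax is taken over clusters for each node, and the sizes `N = 6`, `d = 4`, `C = 2` are arbitrary. The further updates on the cluster graph mentioned above are not modeled.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_cluster_pool(F, A, W_assign):
    """Return (S, Z, A_tilde) for one subgraph."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)
    logits = (A_hat / deg) @ F @ W_assign       # one mean-aggregation GNN layer (assumed)
    S = softmax(logits, axis=1)                 # (N, C): each node's cluster weights sum to 1
    Z = S.T @ F                                 # (C, d) cluster features
    A_tilde = S.T @ A @ S                       # (C, C) cluster adjacency
    return S, Z, A_tilde

rng = np.random.default_rng(0)
N, d, C = 6, 4, 2
F = rng.normal(size=(N, d))
A = np.zeros((N, N))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:   # a simple chain graph
    A[i, j] = A[j, i] = 1.0
W_assign = rng.normal(size=(d, C))

S, Z, A_tilde = soft_cluster_pool(F, A, W_assign)
print(S.shape, Z.shape, A_tilde.shape)
```

With a row-wise softmax, summing the cluster features over clusters recovers the sum of the node features, so the pooling redistributes atom-level information across clusters without discarding any of it.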
At the cluster level, bidirectional cross-attention is employed to model static structural interactions between the ligand and the protein. The ligand-to-protein interaction is computed as
$$\widetilde{\mathbf{Z}}_{\text{lig}\to\text{pro}}=\operatorname{softmax}\!\left(\frac{\mathbf{Z}_{\text{lig}}\mathbf{W}^{(\text{l2p})}\left(\mathbf{Z}_{\text{pro}}\mathbf{W}^{(\text{l2p})}\right)^{\top}}{\sqrt{d}}\right)\cdot\mathbf{Z}_{\text{pro}}\mathbf{W}^{(\text{l2p})}\in\mathbb{R}^{C_{\text{lig}}\times d},$$
which is aggregated into a graph-level representation with a residual connection
$$\mathbf{g}_{\text{static}}^{(\text{lig})}=\operatorname{MLP}_{\text{out}}^{(\text{l2p})}\!\left(\sum_{i=1}^{C_{\text{lig}}}\widetilde{\mathbf{Z}}_{\text{lig}\to\text{pro},i}\right)+\sum_{i=1}^{C_{\text{lig}}}\mathbf{Z}_{\text{lig},i}.$$
Symmetrically, the protein-to-ligand interaction is given by
$$\widetilde{\mathbf{Z}}_{\text{pro}\to\text{lig}}=\operatorname{softmax}\!\left(\frac{\mathbf{Z}_{\text{pro}}\mathbf{W}^{(\text{p2l})}\left(\mathbf{Z}_{\text{lig}}\mathbf{W}^{(\text{p2l})}\right)^{\top}}{\sqrt{d}}\right)\cdot\mathbf{Z}_{\text{lig}}\mathbf{W}^{(\text{p2l})}\in\mathbb{R}^{C_{\text{pro}}\times d},$$
and aggregated as
$$\mathbf{g}_{\text{static}}^{(\text{pro})}=\operatorname{MLP}_{\text{out}}^{(\text{p2l})}\!\left(\sum_{i=1}^{C_{\text{pro}}}\widetilde{\mathbf{Z}}_{\text{pro}\to\text{lig},i}\right)+\sum_{i=1}^{C_{\text{pro}}}\mathbf{Z}_{\text{pro},i}.$$
In this way, the bidirectional cross-attention module produces the graph-level static representations $\mathbf{g}_{\text{static}}^{(\text{lig})}$ and $\mathbf{g}_{\text{static}}^{(\text{pro})}$ in Eq. ([11](https://arxiv.org/html/2606.14217#S3.E11)).
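The bidirectional cross-attention and readout can be sketched as follows. This is a hedged illustration, not the authors' code: following the equations above, one projection matrix per direction is shared by queries, keys, and values, while the output network `mlp_out` (a single `tanh` layer here), the cluster counts, and the random weights are assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(Z_q, Z_kv, W):
    """Attend from the rows of Z_q (queries) to the rows of Z_kv (keys and values)."""
    d = W.shape[1]
    q = Z_q @ W
    kv = Z_kv @ W                                   # shared projection per direction
    attn = softmax(q @ kv.T / np.sqrt(d), axis=1)   # (C_q, C_kv), rows sum to 1
    return attn @ kv                                # (C_q, d)

def readout(Z_q, Z_kv, W, mlp_out):
    """Sum-pool the attended clusters, apply the output MLP, add the residual sum."""
    Z_tilde = cross_attend(Z_q, Z_kv, W)
    return mlp_out(Z_tilde.sum(axis=0)) + Z_q.sum(axis=0)

rng = np.random.default_rng(0)
d, C_lig, C_pro = 4, 3, 5
Z_lig = rng.normal(size=(C_lig, d))
Z_pro = rng.normal(size=(C_pro, d))
W_l2p, W_p2l = rng.normal(size=(d, d)), rng.normal(size=(d, d))
W_out = rng.normal(size=(d, d))
mlp_out = lambda x: np.tanh(x @ W_out)              # hypothetical output network

g_lig = readout(Z_lig, Z_pro, W_l2p, mlp_out)       # ligand attends to protein
g_pro = readout(Z_pro, Z_lig, W_p2l, mlp_out)       # protein attends to ligand
print(g_lig.shape, g_pro.shape)
```

In the paper each direction has its own output MLP; the sketch reuses one `mlp_out` only to stay short.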
## 4 More Experimental Details and Results
Table S1: Performance comparison of CPES and baselines on the LBA 30% and LBA 60% splits.
Figure S1: Performance comparison under different intermolecular distance cutoffs $d_{\text{cutoff}}$ and protein representation configurations (10 Å pocket versus whole protein) on the 2013 core set, 2016 core set, and the 2019 holdout set. (a) RMSE and (b) Pearson correlation coefficient are reported for each configuration. Solid bars denote the 10 Å pocket representation, while hatched bars denote the whole protein representation.
### 4.1 Experimental Setup
#### 4\.1\.1 Dataset Preparation
Our experiments are mainly conducted on the PDBbind dataset, which contains experimentally resolved 3D structures of protein-ligand complexes along with their binding affinities, commonly defined as $-\log(K_{\text{d}})$ or $-\log(K_{\text{i}})$, where $K_{\text{d}}$ and $K_{\text{i}}$ correspond to the dissociation and inhibition constants, respectively. The data preprocessing pipeline consists of the following steps. First, we use PyMOL to select protein residues within a distance cutoff $\alpha_{\text{dis}}$ from the ligand. A residue is included if any of its atoms lies within the cutoff, and the entire residue is then retained. Water molecules and hydrogen atoms are removed. Specifically, for constructing the static structural interaction graphs, we set $\alpha_{\text{dis}}=5$ Å. For the graphs used to derive curvature-informed descriptors of dynamic conformations, we use a larger cutoff $\alpha_{\text{dis}}=10$ Å. This choice avoids insufficient graph connectivity caused by overly small cutoffs, which may otherwise result in the absence of non-zero eigenmodes in the dynamical spectrum. Then, the ligand molecules are converted from MOL2 to PDB format using Open Babel and subsequently parsed using RDKit. Samples that cannot be processed by RDKit are discarded. Finally, the atomic coordinates, atomic features, and chemical bond information obtained from RDKit are used to construct graph representations. The spectral descriptors of dynamic conformational curvature are then derived from the connectivity structure of the graphs. In our experiments, the numbers of selected modes for the ligand, protein, and complex are set to $K_{\mathrm{ul}}=200$, $K_{\mathrm{up}}=200$, and $K_{\mathrm{uc}}=200$, respectively. These settings retain a sufficient range of low-frequency modes for describing dynamic conformational curvature while maintaining manageable computational complexity in the subsequent cross-attention modeling. For samples with fewer available non-zero modes, the spectral sequence is padded using the last available non-zero eigenvalue to preserve the predefined input length.
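The mode selection and padding rule can be written as a small helper. This is a minimal sketch with assumptions that go beyond the text: zero modes are identified with a hypothetical tolerance `tol`, "low-frequency" is taken to mean the smallest non-zero eigenvalues, and the function name `pad_spectrum` is illustrative.

```python
def pad_spectrum(eigenvalues, k=200, tol=1e-8):
    """Keep the k lowest non-zero modes; pad with the last kept value up to length k."""
    nonzero = sorted(v for v in eigenvalues if v > tol)   # drop (near-)zero modes
    if not nonzero:
        raise ValueError("no non-zero modes available")
    kept = nonzero[:k]
    return kept + [kept[-1]] * (k - len(kept))

# A tiny spectrum with two zero modes, padded to a fixed length of 6.
padded = pad_spectrum([0.0, 0.0, 0.5, 1.2, 3.0], k=6)
print(padded)   # [0.5, 1.2, 3.0, 3.0, 3.0, 3.0]
```

Padding with the last eigenvalue, rather than with zeros, avoids introducing artificial zero-curvature modes into the fixed-length sequence.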
#### 4\.1\.2 Evaluation Metrics
To comprehensively evaluate the performance of the proposed model across different datasets, we adopt three widely used regression metrics, including Root Mean Square Error \(RMSE\), Pearson correlation coefficient, and Spearman rank correlation coefficient\. RMSE measures the absolute discrepancy between predicted and ground\-truth binding affinities and is particularly sensitive to large errors, thereby reflecting overall prediction accuracy\. Pearson evaluates the linear correlation between predictions and true values, capturing the consistency of numerical trends regardless of scale differences\. Spearman further assesses the monotonic relationship based on ranked values, providing a more robust measure under potential non\-linear dependencies\. Together, these metrics offer a comprehensive evaluation from the perspectives of absolute error, linear correlation, and ranking consistency\.
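For concreteness, the three metrics can be computed with the standard library alone. The sketch below is illustrative: the helper names and the five affinity values are made up for the example, and tied values receive average ranks, which is the usual convention for the Spearman coefficient.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

y_true = [4.0, 5.5, 6.1, 7.3, 8.0]   # hypothetical measured affinities
y_pred = [4.4, 5.0, 6.9, 7.0, 9.0]   # hypothetical predictions
print(rmse(y_true, y_pred), pearson(y_true, y_pred), spearman(y_true, y_pred))
```

In this toy example the predictions preserve the ordering of the true values, so the Spearman coefficient is 1 even though the RMSE is non-zero, which shows why ranking consistency and absolute error are reported separately.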
### 4\.2 Dissimilar Protein Scenarios with ATOM3D
Datasets. To further evaluate model performance under dissimilar protein scenarios, experiments are additionally conducted on the Ligand Binding Affinity (LBA) benchmark from the ATOM3D project[[53](https://arxiv.org/html/2606.14217#biba.bib87)]. The ATOM3D LBA benchmark adopts sequence-identity-based partitioning to reduce similarity between training and test proteins. Following the official benchmark protocol, two evaluation settings are considered: LBA30% and LBA60%, corresponding to protein sequence identity thresholds of 30% and 60%, respectively. Under these settings, proteins in the test set share at most the specified sequence identity with proteins in the training set, providing a more rigorous evaluation of model robustness.
Comparison with Baselines\. Table [S1](https://arxiv.org/html/2606.14217#S4.T1) compares the proposed model with representative interaction\-free methods \(e\.g\., TAPE\[[45](https://arxiv.org/html/2606.14217#biba.bib80)\] and ProtTrans\[[1](https://arxiv.org/html/2606.14217#biba.bib81)\]\), pre\-training\-based approaches \(e\.g\., Uni\-Mol\[[60](https://arxiv.org/html/2606.14217#biba.bib83)\], GeoSSL\[[36](https://arxiv.org/html/2606.14217#biba.bib85)\], and BindNet\[[16](https://arxiv.org/html/2606.14217#biba.bib86)\]\), and interaction\-based models \(e\.g\., ProNet\[[55](https://arxiv.org/html/2606.14217#biba.bib90)\], GET\[[31](https://arxiv.org/html/2606.14217#biba.bib92)\], and CheapNet\[[35](https://arxiv.org/html/2606.14217#biba.bib29)\]\) on the ATOM3D LBA benchmarks\. Overall, the proposed method achieves the best or highly competitive performance across both the LBA30% and LBA60% splits\. On the LBA30% split, the proposed model achieves the highest Pearson correlation of 0\.651 and Spearman correlation of 0\.641\. Although the RMSE on LBA30% is marginally higher than that of CheapNet and GET, the improved correlation metrics indicate better consistency in affinity ranking across structurally diverse proteins\. On the LBA60% split, the proposed model further achieves the best overall performance\. These results demonstrate strong robustness and cross\-protein generalization ability under dissimilar protein scenarios\.
### 4\.3 Effect of Graph Construction Strategies
As defined in Eq\. \([8](https://arxiv.org/html/2606.14217#S3.E8)\), the intermolecular distance cutoff $d_{\text{cutoff}}$ is used in the construction of the bound complex graph for dynamic mode analysis, where it determines whether ligand and protein atoms are connected through intermolecular interaction edges according to their spatial distance\. Physically, this cutoff controls the range of protein\-ligand coupling considered when deriving curvature\-informed dynamic descriptors\. An overly small cutoff may miss relevant local intermolecular couplings, whereas an overly large cutoff may introduce redundant or noisy long\-range connections\. To investigate its influence, we evaluate the model under different $d_{\text{cutoff}}$ values and protein representation configurations, as shown in Fig\. [S1](https://arxiv.org/html/2606.14217#S4.F1)\. The whole protein representation uses the complete protein structure provided in the dataset, while the pocket representation retains complete protein residues selected within $\alpha_{\text{dis}} = 10\,\text{\AA}$ of the ligand\. Overall, $d_{\text{cutoff}} = 3\,\text{\AA}$ achieves the best performance across the benchmark test sets, suggesting that an appropriate cutoff effectively captures key local protein\-ligand coupling patterns while avoiding excessive noisy connections\. In addition, using the $10\,\text{\AA}$ binding pocket consistently outperforms or matches the whole protein representation\. This observation indicates that incorporating the entire protein does not necessarily improve affinity prediction\. These results suggest that focusing on the local binding environment provides more informative dynamic and structural patterns for the model\.
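The two graph construction choices studied here can be sketched in a few lines of NumPy\. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the comparison is assumed to be inclusive \(distance $\le$ threshold\), and the pocket is assumed to keep every atom of a residue as soon as one of its atoms lies within $\alpha_{\text{dis}}$ of any ligand atom\. The exact rule is given by Eq\. \(8\) and the released code\.

```python
import numpy as np


def intermolecular_edges(ligand_xyz, protein_xyz, d_cutoff=3.0):
    """Return a (2, E) array of (ligand_idx, protein_idx) pairs.

    Assumption: a pair is connected when its Euclidean distance (in Å)
    is at most d_cutoff.
    """
    diff = ligand_xyz[:, None, :] - protein_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)          # shape (n_lig, n_prot)
    lig_idx, prot_idx = np.nonzero(dist <= d_cutoff)
    return np.stack([lig_idx, prot_idx], axis=0)


def pocket_atom_mask(ligand_xyz, protein_xyz, residue_ids, alpha_dis=10.0):
    """Boolean mask over protein atoms that belong to the pocket.

    Assumption: a residue is retained as a whole if any of its atoms is
    within alpha_dis (in Å) of any ligand atom.
    """
    diff = protein_xyz[:, None, :] - ligand_xyz[None, :, :]
    min_dist = np.linalg.norm(diff, axis=-1).min(axis=1)
    near_residues = np.unique(residue_ids[min_dist <= alpha_dis])
    return np.isin(residue_ids, near_residues)


# Toy coordinates (hypothetical): two ligand atoms, three protein atoms.
ligand_xyz = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
protein_xyz = np.array([[2.5, 0.0, 0.0], [12.0, 0.0, 0.0], [30.0, 0.0, 0.0]])
residue_ids = np.array([0, 0, 1])   # first two protein atoms share a residue

edges = intermolecular_edges(ligand_xyz, protein_xyz, d_cutoff=3.0)
mask = pocket_atom_mask(ligand_xyz, protein_xyz, residue_ids, alpha_dis=10.0)
print(edges)   # ligand-protein pairs within 3 Å
print(mask)    # protein atoms kept in the 10 Å pocket
```

In the toy example the second protein atom is farther than 10 Å from the ligand but is still retained because its residue contains an atom inside the pocket radius, which is one way to read "complete residues" in the text above\.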
## References
- \[1\]A\. Elnaggar, M\. Heinzinger, C\. Dallago, G\. Rehawi, Y\. Wang, L\. Jones, T\. Gibbs, T\. Feher, C\. Angerer, M\. Steinegger, D\. Bhowmik, and B\. Rost\(2022\)ProtTrans: Toward Understanding the Language of Life Through Self\-Supervised Learning\.IEEE Transactions on Pattern Analysis and Machine Intelligence44\(10\),pp\. 7112–7127\.External Links:[Document](https://dx.doi.org/10.1109/TPAMI.2021.3095381)Cited by:[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2a.p2.1),[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.11.5.1)\.
- \[2\]R\. Abel, L\. Wang, E\. D\. Harder, B\. J\. Berne, and R\. A\. Friesner\(2017\)Advancing Drug Discovery through Enhanced Free Energy Calculations\.Accounts of Chemical Research50\(7\),pp\. 1625–1632\.External Links:[Document](https://dx.doi.org/10.1021/acs.accounts.7b00083)Cited by:[§1\.1](https://arxiv.org/html/2606.14217#S1.SS1.p1.1)\.
- \[3\]A\. R\. Atilgan, S\. R\. Durell, R\. L\. Jernigan, M\. C\. Demirel, O\. Keskin, and I\. Bahar\(2001\)Anisotropy of Fluctuation Dynamics of Proteins with an Elastic Network Model\.Biophysical Journal80\(1\),pp\. 505–515\.External Links:[Document](https://dx.doi.org/10.1016/S0006-3495%2801%2976033-X)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p2.1)\.
- \[4\]I\. Bahar, T\. R\. Lezon, A\. Bakan, and I\. H\. Shrivastava\(2010\)Normal Mode Analysis of Biomolecular Structures: Functional Mechanisms of Membrane Proteins\.Chemical Reviews110\(3\),pp\. 1463–1497\.External Links:[Document](https://dx.doi.org/10.1021/cr900095e)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p2.1)\.
- \[5\]I\. Bahar and A\. J\. Rader\(2005\)Coarse\-grained normal mode analysis in structural biology\.Current Opinion in Structural Biology15\(5\),pp\. 586–592\.External Links:[Document](https://dx.doi.org/10.1016/j.sbi.2005.08.007)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p2.1)\.
- \[6\]T\. Bepler and B\. Berger\(2019\)Learning protein sequence embeddings using information from structure\.InInternational Conference on Learning Representations,Cited by:[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.9.3.1)\.
- \[7\]J\. Bruna\(2013\)Spectral networks and locally connected networks on graphs\.arXiv preprint arXiv:1312\.6203\.Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[8\]S\. Cain, A\. Risheh, and N\. Forouzesh\(2022\)A Physics\-Guided Neural Network for Predicting Protein–Ligand Binding Free Energy: From Host–Guest Systems to the PDBbind Database\.Biomolecules12\(7\),pp\. 919\.External Links:[Document](https://dx.doi.org/10.3390/biom12070919)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p3.1)\.
- \[9\]Y\. Chiang, W\. Hui, and S\. Chang\(2022\)Encoding protein dynamic information in graph representation for functional residue identification\.Cell Reports Physical Science3\(7\),pp\. 100975\.External Links:[Document](https://dx.doi.org/10.1016/j.xcrp.2022.100975)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p3.1)\.
- \[10\]M\. Defferrard, X\. Bresson, and P\. Vandergheynst\(2016\)Convolutional neural networks on graphs with fast localized spectral filtering\.InAdvances in Neural Information Processing Systems,Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[11\]L\. Dong, X\. Qu, Y\. Zhao, and B\. Wang\(2021\)Prediction of Binding Free Energy of Protein–Ligand Complexes with a Hybrid Molecular Mechanics/Generalized Born Surface Area and Machine Learning Method\.ACS Omega6\(48\),pp\. 32938–32947\.External Links:[Document](https://dx.doi.org/10.1021/acsomega.1c04996)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p1.1)\.
- \[12\]R\. O\. Dror, R\. M\. Dirks, J\. P\. Grossman, H\. Xu, and D\. E\. Shaw\(2012\)Biomolecular Simulation: A Computational Microscope for Molecular Biology\.Annual Review of Biophysics41,pp\. 429–452\.External Links:[Document](https://dx.doi.org/10.1146/annurev-biophys-042910-155245)Cited by:[§1\.1](https://arxiv.org/html/2606.14217#S1.SS1.p1.1)\.
- \[13\]W\. Du, Y\. Du, L\. Wang, D\. Feng, G\. Wang, S\. Ji, C\. Gomes, and Z\. Ma\(2023\)A new perspective on building efficient and expressive 3D equivariant graph neural networks\.InAdvances in Neural Information Processing Systems,pp\. 66647–66674\.Cited by:[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.21.15.1)\.
- \[14\]S\. Eismann, R\. J\. L\. Townshend, N\. Thomas, M\. Jagota, B\. Jing, and R\. O\. Dror\(2021\)Hierarchical, rotation\-equivariant neural networks to select structural models of protein complexes\.Proteins: Structure, Function, and Bioinformatics89\(5\),pp\. 493–501\.External Links:[Document](https://dx.doi.org/10.1002/prot.26033)Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[15\]E\. Eyal, L\. Yang, and I\. Bahar\(2006\)Anisotropic network model: systematic evaluation and a new web interface\.Bioinformatics22\(21\),pp\. 2619–2627\.External Links:[Document](https://dx.doi.org/10.1093/bioinformatics/btl448)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p2.1)\.
- \[16\]S\. Feng, M\. Li, Y\. Jia, W\. Ma, and Y\. Lan\(2024\)Protein\-ligand binding representation learning from fine\-grained interactions\.InThe Twelfth International Conference on Learning Representations,Cited by:[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2a.p2.1),[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.16.10.1)\.
- \[17\]M\. Finzi, S\. Stanton, P\. Izmailov, and A\. G\. Wilson\(2020\)Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data\.InProceedings of Machine Learning Research,pp\. 3165–3176\.Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[18\]F\. Fuchs, D\. Worrall, V\. Fischer, and M\. Welling\(2020\)SE\(3\)\-transformers: 3d roto\-translation equivariant attention networks\.InAdvances in neural information processing systems,pp\. 1970–1981\.Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[19\]P\. Gainza, F\. Sverrisson, F\. Monti, E\. Rodolà, D\. Boscaini, M\. M\. Bronstein, and B\. E\. Correia\(2020\)Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning\.Nature Methods17\(2\),pp\. 184–192\.External Links:[Document](https://dx.doi.org/10.1038/s41592-019-0666-6)Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p2.1)\.
- \[20\]B\. Gao, Y\. Jia, Y\. Mo, Y\. Ni, W\. Ma, Z\. Ma, and Y\. Lan\(2024\)Self\-supervised Pocket Pretraining via Protein Fragment\-Surroundings Alignment\.InThe Twelfth International Conference on Learning Representations,Cited by:[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.14.8.1)\.
- \[21\]J\. Gilmer, S\. S\. Schoenholz, P\. F\. Riley, O\. Vinyals, and G\. E\. Dahl\(2017\)Neural Message Passing for Quantum Chemistry\.InProceedings of the 34th International Conference on Machine Learning,pp\. 1263–1272\.Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p2.1)\.
- \[22\]A\. Goyal and Y\. Bengio\(2022\)Inductive biases for deep learning of higher\-level cognition\.Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences478\(2266\),pp\. 20210068\.External Links:[Document](https://dx.doi.org/10.1098/rspa.2021.0068)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p1.1)\.
- \[23\]T\. Harren, T\. Gutermuth, C\. Grebner, G\. Hessler, and M\. Rarey\(2024\)Modern machine\-learning for binding affinity estimation of protein–ligand complexes: Progress, opportunities, and challenges\.WIREs Computational Molecular Science14\(3\),pp\. e1716\.External Links:[Document](https://dx.doi.org/10.1002/wcms.1716)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p1.1)\.
- \[24\]P\. Hermosilla, M\. Schäfer, M\. Lang, G\. Fackelmann, P\. Vázquez, B\. Kozlíková, M\. Krone, T\. Ritschel, and T\. Ropinski\(2021\)Intrinsic\-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures\.InInternational Conference on Learning Representations,Cited by:[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.18.12.1)\.
- \[25\]B\. Isin, K\. C\. Tirupula, Z\. N\. Oltvai, J\. Klein\-Seetharaman, and I\. Bahar\(2012\)Identification of motions in membrane proteins by elastic network models and their experimental validation\.Methods Mol Biol914,pp\. 285–317\.External Links:[Document](https://dx.doi.org/10.1007/978-1-62703-023-6%5F17)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p2.1)\.
- \[26\]W\. Jespers, J\. Åqvist, and H\. Gutiérrez\-de\-Terán\(2021\)Free Energy Calculations for Protein–Ligand Binding Prediction\.InProtein\-Ligand Interactions and Drug Design,F\. Ballante \(Ed\.\)\.Cited by:[§1\.1](https://arxiv.org/html/2606.14217#S1.SS1.p1.1)\.
- \[27\]M\. Karplus and J\. A\. McCammon\(2002\)Molecular dynamics simulations of biomolecules\.Nature Structural Biology9\(9\),pp\. 646–652\.External Links:[Document](https://dx.doi.org/10.1038/nsb0902-646)Cited by:[§1\.1](https://arxiv.org/html/2606.14217#S1.SS1.p1.1)\.
- \[28\]M\. H\. Kim, B\. H\. Lee, and M\. K\. Kim\(2015\)Robust elastic network model: A general modeling for precise understanding of protein dynamics\.Journal of Structural Biology190\(3\),pp\. 338–347\.External Links:[Document](https://dx.doi.org/10.1016/j.jsb.2015.04.007)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p2.1)\.
- \[29\]E\. King, E\. Aitchison, H\. Li, and R\. Luo\(2021\)Recent Developments in Free Energy Calculations for Drug Discovery\.Front Mol Biosci8,pp\. 712085\.External Links:[Document](https://dx.doi.org/10.3389/fmolb.2021.712085)Cited by:[§1\.1](https://arxiv.org/html/2606.14217#S1.SS1.p1.1)\.
- \[30\]T\. N\. Kipf\(2016\)Semi\-supervised classification with graph convolutional networks\.arXiv preprint arXiv:1609\.02907\.Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[31\]X\. Kong, W\. Huang, and Y\. Liu\(2024\)Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning\.InForty\-first International Conference on Machine Learning,Cited by:[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2a.p2.1),[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.22.16.1)\.
- \[32\]Y\. LeCun, Y\. Bengio, and G\. Hinton\(2015\)Deep learning\.Nature521\(7553\),pp\. 436–444\.External Links:[Document](https://dx.doi.org/10.1038/nature14539)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p1.1)\.
- \[33\]H\. Lee, P\. S\. Emani, and M\. B\. Gerstein\(2024\)Improved Prediction of Ligand–Protein Binding Affinities by Meta\-modeling\.Journal of Chemical Information and Modeling64\(23\),pp\. 8684–8704\.External Links:[Document](https://dx.doi.org/10.1021/acs.jcim.4c01116)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p1.1)\.
- \[34\]Y\. Li, C\. Gu, T\. Dullien, O\. Vinyals, and P\. Kohli\(2019\)Graph Matching Networks for Learning the Similarity of Graph Structured Objects\.InProceedings of the 36th International Conference on Machine Learning,pp\. 3835–3845\.Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[35\]H\. Lim, S\. Kim, and S\. Lee\(2025\)CheapNet: Cross\-attention on Hierarchical representations for Efficient protein\-ligand binding Affinity Prediction\.InThe Thirteenth International Conference on Learning Representations,Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p2.1),[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2a.p2.1),[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.23.17.1)\.
- \[36\]S\. Liu, H\. Guo, and J\. Tang\(2023\)Molecular Geometry Pretraining with SE\(3\)\-Invariant Denoising Distance Matching\.InThe Eleventh International Conference on Learning Representations,Cited by:[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2a.p2.1),[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.15.9.1)\.
- \[37\]J\. R\. López\-Blanco, J\. I\. Aliaga, E\. S\. Quintana\-Ortí, and P\. Chacón\(2014\)iMODS: internal coordinates normal mode analysis server\.Nucleic Acids Research42\(W1\),pp\. W271–W276\.External Links:[Document](https://dx.doi.org/10.1093/nar/gku339)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p2.1)\.
- \[38\]M\. M\. Bronstein, J\. Bruna, Y\. LeCun, A\. Szlam, and P\. Vandergheynst\(2017\)Geometric Deep Learning: Going beyond Euclidean data\.IEEE Signal Processing Magazine34\(4\),pp\. 18–42\.External Links:[Document](https://dx.doi.org/10.1109/MSP.2017.2693418)Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[39\]J\. Ma\(2005\)Usefulness and Limitations of Normal Mode Analysis in Modeling Dynamics of Biomolecular Complexes\.Structure13\(3\),pp\. 373–380\.External Links:[Document](https://dx.doi.org/10.1016/j.str.2005.02.002)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p2.1)\.
- \[40\]Y\. Min, Y\. Wei, P\. Wang, X\. Wang, H\. Li, N\. Wu, S\. Bauer, S\. Zheng, Y\. Shi, Y\. Wang, J\. Wu, D\. Zhao, and J\. Zeng\(2024\)From Static to Dynamic Structures: Improving Binding Affinity Prediction with Graph\-Based Deep Learning\.Advanced Science11\(40\),pp\. 2405404\.External Links:[Document](https://dx.doi.org/10.1002/advs.202405404)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p3.1)\.
- \[41\]D\. L\. Mobley and M\. K\. Gilson\(2017\)Predicting Binding Free Energies: Frontiers and Benchmarks\.Annual Review of Biophysics46,pp\. 531–558\.External Links:[Document](https://dx.doi.org/10.1146/annurev-biophys-070816-033654)Cited by:[§1\.1](https://arxiv.org/html/2606.14217#S1.SS1.p1.1)\.
- \[42\]H\. Öztürk, A\. Özgür, and E\. Ozkirimli\(2018\)DeepDTA: deep drug–target binding affinity prediction\.Bioinformatics34\(17\),pp\. i821–i829\.External Links:[Document](https://dx.doi.org/10.1093/bioinformatics/bty593)Cited by:[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.8.2.2)\.
- \[43\]R\. G\. Parr\(1989\)Density functional theory of atoms and molecules\.Springer\.Cited by:[§1\.1](https://arxiv.org/html/2606.14217#S1.SS1.p1.1)\.
- \[44\]N\. Rahaman, A\. Baratin, D\. Arpit, F\. Draxler, M\. Lin, F\. Hamprecht, Y\. Bengio, and A\. Courville\(2019\)On the Spectral Bias of Neural Networks\.InProceedings of the 36th International Conference on Machine Learning,pp\. 5301–5310\.Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p1.1)\.
- \[45\]R\. Rao, N\. Bhattacharya, N\. Thomas, Y\. Duan, P\. Chen, J\. Canny, P\. Abbeel, and Y\. Song\(2019\)Evaluating Protein Transfer Learning with TAPE\.InAdvances in Neural Information Processing Systems,Cited by:[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2a.p2.1),[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.10.4.1)\.
- \[46\]M\. H\. S\. Segler, T\. Kogej, C\. Tyrchan, and M\. P\. Waller\(2018\)Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks\.ACS Central Science4\(1\),pp\. 120–131\.External Links:[Document](https://dx.doi.org/10.1021/acscentsci.7b00512)Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p2.1)\.
- \[47\]H\. M\. Senn and W\. Thiel\(2009\)QM/MM Methods for Biomolecular Systems\.Angewandte Chemie International Edition48\(7\),pp\. 1198–1229\.External Links:[Document](https://dx.doi.org/10.1002/anie.200802019)Cited by:[§1\.1](https://arxiv.org/html/2606.14217#S1.SS1.p1.1)\.
- \[48\]V\. R\. Somnath, C\. Bunne, and A\. Krause\(2021\)Multi\-Scale Representation Learning on Proteins\.InAdvances in Neural Information Processing Systems,pp\. 25244–25255\.Cited by:[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.19.13.1)\.
- \[49\]T\. Cohen and M\. Welling\(2016\)Group Equivariant Convolutional Networks\.InProceedings of The 33rd International Conference on Machine Learning,pp\. 2990–2999\.Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[50\]F\. Tama and Y\. H\. Sanejouand\(2001\)Conformational change of proteins arising from normal mode calculations\.Protein Engineering14\(1\),pp\. 1–6\.External Links:[Document](https://dx.doi.org/10.1093/protein/14.1.1)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p2.1)\.
- \[51\]N\. Thomas, T\. Smidt, S\. Kearnes, L\. Yang, L\. Li, K\. Kohlhoff, and P\. Riley\(2018\)Tensor field networks: Rotation\-and translation\-equivariant neural networks for 3d point clouds\.arXiv preprint arXiv:1802\.08219\.Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[52\]R\. J\. L\. Townshend, S\. Eismann, A\. M\. Watkins, R\. Rangan, M\. Karelina, R\. Das, and R\. O\. Dror\(2021\)Geometric deep learning of RNA structure\.Science373\(6558\),pp\. 1047–1051\.External Links:[Document](https://dx.doi.org/10.1126/science.abe5650)Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p2.1)\.
- \[53\]R\. J\. L\. Townshend, M\. Vögele, P\. A\. Suriana, A\. Derry, A\. Powers, Y\. Laloudakis, S\. Balachandar, B\. Jing, B\. M\. Anderson, S\. Eismann, R\. Kondor, R\. Altman, and R\. O\. Dror\(2021\)ATOM3D: Tasks on Molecules in Three Dimensions\.InThirty\-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track \(Round 1\),Cited by:[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2a.p1.2),[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.17.11.2)\.
- \[54\]H\. Wako and S\. Endo\(2017\)Normal mode analysis as a method to derive protein dynamics information from the Protein Data Bank\.Biophys Rev9\(6\),pp\. 877–893\.External Links:[Document](https://dx.doi.org/10.1007/s12551-017-0330-2)Cited by:[§1\.3](https://arxiv.org/html/2606.14217#S1.SS3.p2.1)\.
- \[55\]L\. Wang, H\. Liu, Y\. Liu, J\. Kurtin, and S\. Ji\(2023\)Learning Hierarchical Protein Representations via Complete 3D Graph Networks\.InThe Eleventh International Conference on Learning Representations,Cited by:[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2a.p2.1),[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.20.14.1)\.
- \[56\]F\. Wu, L\. Wu, D\. Radev, J\. Xu, and S\. Z\. Li\(2023\)Integration of pre\-trained protein language models into geometric deep learning networks\.Communications Biology6\(1\),pp\. 876\.External Links:[Document](https://dx.doi.org/10.1038/s42003-023-05133-1)Cited by:[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.12.6.2)\.
- \[57\]K\. Xu, W\. Hu, J\. Leskovec, and S\. Jegelka\(2019\)How powerful are graph neural networks?\.InInternational Conference on Learning Representations,Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p1.1)\.
- \[58\]Z\. Yang, W\. Zhong, Q\. Lv, T\. Dong, and C\. Yu\-Chian Chen\(2023\)Geometric Interaction Graph Neural Network for Predicting Protein–Ligand Binding Affinities from 3D Structures \(GIGN\)\.The Journal of Physical Chemistry Letters14\(8\),pp\. 2020–2033\.External Links:[Document](https://dx.doi.org/10.1021/acs.jpclett.2c03906)Cited by:[§1\.2](https://arxiv.org/html/2606.14217#S1.SS2.p2.1)\.
- \[59\]D\. M\. York\(2023\)Modern Alchemical Free Energy Methods for Drug Discovery Explained\.ACS Phys Chem Au3\(6\),pp\. 478–491\.External Links:[Document](https://dx.doi.org/10.1021/acsphyschemau.3c00033)Cited by:[§1\.1](https://arxiv.org/html/2606.14217#S1.SS1.p1.1)\.
- \[60\]G\. Zhou, Z\. Gao, Q\. Ding, H\. Zheng, H\. Xu, Z\. Wei, L\. Zhang, and G\. Ke\(2023\)Uni\-Mol: A Universal 3D Molecular Representation Learning Framework\.InThe Eleventh International Conference on Learning Representations,Cited by:[§4\.2](https://arxiv.org/html/2606.14217#S4.SS2a.p2.1),[Table S1](https://arxiv.org/html/2606.14217#S4.T1.6.6.13.7.1)\.