Where Black-box Drug-Target Interaction Prediction Models Look: Cross-Method Explainability
Summary
A study presenting a cross-method explainability audit of the BridgeDPI drug-target interaction model, combining gradient-based attributions and occlusion to reveal modality dominance and artifacts, providing testable hypotheses for drug discovery.
# Where Black-box Drug-Target Interaction Prediction Models Look: Cross-Method Explainability
Source: [https://arxiv.org/html/2606.14245](https://arxiv.org/html/2606.14245)
###### Abstract
Drug\-target interaction \(DTI\) and affinity \(DTA\) predictors increasingly achieve strong benchmark scores, yet their internal use of sequence, fingerprint, and graph features often remains opaque\. We present an interpretability audit of BridgeDPI architecture on three different datasets including Gao, Human, and C\.elegans\. This study combines gradient\-based attributions—integrated gradients, saliency, layer\-wise relevance propagation, SmoothGrad, and SmoothGrad\-IG—with feature\-wise occlusion ablation and strict intersection consensus across methods to reduce single\-explainer bias\. We summarize sensitivity and signed effects at raw inputs, at the bridge similarity scaffold, and through the graph convolution, including edge\-level sensitivities and targeted edge removals\. The results show that explainability is most informative when treated as model criticism: it reveals modality dominance, padding and special\-token artifacts, dataset\-dependent cooperative versus suppressive effects across layers, and chemistry\-consistent fragment and composition motifs where methods agree\. These analyses do not substitute for structural or experimental ground truth, yet they can provide testable hypotheses for downstream validation in computational drug discovery pipelines\. More broadly, applying modern XAI to contemporary DTI/DTA models is still an early pass over the rich structure implicit in trained weights and data—yet even this first layer of scrutiny already helps researchers relate predictions to drug\- and target\-side representations and to prioritize external validation\.
###### keywords:
Drug\-Target Interaction , Drug\-Target Affinity, Explainable Artificial Intelligence, BridgeDPI, Post\-hoc Explanations
Affiliation: Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
## 1 Introduction
Drug\-target interaction \(DTI\) and drug\-target affinity \(DTA\) prediction models have advanced significantly through deep learning techniques, which excel at capturing complex relationships between drugs and proteins\. However, their inherent complexity often renders them “black boxes”, making it challenging to interpret the reasoning behind specific predictions\. Explainable artificial intelligence \(XAI\) methods aim to address this challenge by providing transparent insights into model decision\-making processes\.
The discrepancy between the vast landscape of DTI/DTA models and the limited application of XAI to such models highlights that many models remain effectively “black boxes” and that their decisions are often ignored in biological analysis\. Consequently, post\-hoc explainability methods, that is, techniques that rely on auxiliary analytical procedures applied to trained models \(after training\) to infer how predictions are produced, become essential: they provide a practical way to extract insights from existing models without requiring re\-design, and enable researchers to trace predictions back to meaningful molecular features, residues, or substructures\.
Many drug\-target interaction and drug\-target affinity prediction models focus on improving performance metrics, often adding layers of complexity without considering how these complexities affect interpretability\. While such models may claim superior accuracy, high predictive performance alone does not guarantee reliability, generalizability, or applicability in real\-world settings\. Understanding why a model predicts that a particular drug interacts with a specific protein can reveal the underlying mechanisms of binding\. This interpretability not only validates predictions but also provides valuable insights for drug discovery, uncovering patterns and relationships that may be difficult to identify otherwise\[[29](https://arxiv.org/html/2606.14245#bib.bib1)\]\. In this way, DTI models can go beyond standard explanations to serve as hypothesis generators, guiding the design of novel therapeutics\.
To our knowledge, this research is one of the first studies in which multiple post\-hoc XAI techniques are applied to a black\-box drug\-protein interaction \(DPI\) prediction model, aiming to reveal how drugs, proteins, and bridge nodes contribute to model decisions\. Our study emphasizes interpretability rather than model development, highlighting biologically relevant patterns and key predictive features\. Our contributions are:
- 1\. XAI on the multi\-layer BridgeDPI model: Integrated Gradients \(IG\), Layer\-wise Relevance Propagation \(LRP\), saliency maps, SmoothGrad, and perturbation\-based methods are applied to a multi\-layered architecture with high performance metrics\.
- 2\. Bridge node analysis: The role of bridge nodes and edges in linking drug and protein features is explored\.
- 3\. Dataset\-level aggregation: The top\-scoring sections of the different input layers are highlighted on three datasets, using a consensus of methods for robustness\.
## 2 Related works
DTI/DTA studies have so far used four families of post\-hoc XAI methods that are compatible with current deep learning architectures: gradient\-based attribution, attention visualizations, perturbation and surrogate techniques, and counterfactual edits\. Other XAI paradigms, such as ad\-hoc and intrinsic methods, may ultimately prove valuable for DTI but remain largely unexplored in this context\.
### 2\.1 Explainable DTI/DTA models
Attention\-based visualization has become the dominant explainability method in DTI/DTA studies because it reveals important segments by visualizing heatmaps without further processing\. Early attention mechanisms, such as the two\-way attention between protein residues and drug atoms proposed by Gao et al\.\[[3](https://arxiv.org/html/2606.14245#bib.bib3)\], established the groundwork for explainable DTI prediction\. Subsequent work, including DeepAffinity\[[7](https://arxiv.org/html/2606.14245#bib.bib8)\], extended this idea with separate, marginalized, and joint attention modules, enabling interpretability across hierarchical biological contexts\. A further refinement appears in MONN\[[12](https://arxiv.org/html/2606.14245#bib.bib9)\], which integrates explicit pairwise atom\-residue interaction supervision using contact labels from over 13000 protein\-ligand complexes\. The ML\-DTI model\[[33](https://arxiv.org/html/2606.14245#bib.bib10)\] enhances interaction awareness through mutual\-learning layers connecting drug and protein encoders alongside multi\-head and position\-aware attention, generating atom\- and residue\-level saliency distributions\. AffinityVAE\[[30](https://arxiv.org/html/2606.14245#bib.bib14)\] uses a mutual attention module which computes ligand\-to\-protein and protein\-to\-ligand attention weights and an intrinsic Atom\-Residue Contact Map module that concatenates protein and ligand features into a two\-dimensional interaction map and uses a residual convolutional network to predict the probability of contact between each residue\-atom pair\. A multi\-attention fusion strategy follows in MGMA\-DTI\[[11](https://arxiv.org/html/2606.14245#bib.bib20)\], showcasing attention aggregation at multiple representational scales\. Cross\- and self\-attention have likewise become central to interpretability\.
ArkDTA\[[5](https://arxiv.org/html/2606.14245#bib.bib15)\] leverages ground\-truth non\-covalent interactions \(NCIs\) from protein\-ligand complexes to form atom\-residue interaction matrices regularized by an attention loss term\. ICAN\[[10](https://arxiv.org/html/2606.14245#bib.bib12)\] introduces a statistical interpretability assessment aligning attention heatmaps with experimentally confirmed binding sites, whereas AttentionSiteDTI\[[35](https://arxiv.org/html/2606.14245#bib.bib11)\] infers self\-attention scores over concatenated drug\-target embeddings\. A sequential attention cascade emerges in FragXsiteDTI\[[8](https://arxiv.org/html/2606.14245#bib.bib16)\], whose learnable latent query attends to pockets, performs self\-interaction refinement, and subsequently queries drug fragments\. Expanding on this paradigm, BindingSiteAugmentedDTA\[[37](https://arxiv.org/html/2606.14245#bib.bib13)\] adapts AttentionSiteDTI’s graph\-based pocket prediction to identify the most relevant binding pockets by reinterpreting protein\-ligand complexes as natural language processing\-style sequences\. Hierarchical attention frameworks have further advanced explainability: HiGraphDTI\[[14](https://arxiv.org/html/2606.14245#bib.bib17)\] employs a cross\-level hierarchical scheme linking protein segments to atom\-, motif\-, and global representations, while INGNN\-DTI\[[26](https://arxiv.org/html/2606.14245#bib.bib18)\] adopts a nested hierarchical graph architecture to learn molecular and protein embeddings across multiple scales\. Similarly, MultiGranDTI\[[6](https://arxiv.org/html/2606.14245#bib.bib19)\] exploits learnable assignment matrices that integrate multi\-granularity information in a unified graph framework\.
Complementing attention mechanisms, gradient\-based attribution approaches estimate feature importance by propagating output gradients back to salient input components, revealing model sensitivity to minimal perturbations\. Representative methods include simple saliency maps\[[24](https://arxiv.org/html/2606.14245#bib.bib21)\], Integrated Gradients\[[27](https://arxiv.org/html/2606.14245#bib.bib22)\], Gradient × Input and DeepLIFT\-style formulations\[[23](https://arxiv.org/html/2606.14245#bib.bib23)\], Grad\-CAM\[[21](https://arxiv.org/html/2606.14245#bib.bib24)\], Grad\-AAM\[[34](https://arxiv.org/html/2606.14245#bib.bib25)\], and GNNExplainer\[[36](https://arxiv.org/html/2606.14245#bib.bib26)\]\. Several adaptations tailor these techniques to biochemical data\. Monteiro et al\.\[[17](https://arxiv.org/html/2606.14245#bib.bib27)\] introduced a regression variant of Grad\-CAM that produces localization maps for one\-dimensional protein sequences and SMILES strings\. GSAML\-DTA\[[13](https://arxiv.org/html/2606.14245#bib.bib28)\] incorporates Grad\-AAM alongside intra\-graph attention weights extracted from protein\-encoder GAT layers, while MGraphDTA\[[34](https://arxiv.org/html/2606.14245#bib.bib25)\] applies Grad\-AAM both to its MGNN backbone and GAT baseline using native attention maps\. MvGraphDTA\[[38](https://arxiv.org/html/2606.14245#bib.bib29)\] computes input\-feature gradients with respect to prediction loss, ranking their magnitudes as per\-feature importance scores\. The Structure\-Aware GNN\[[22](https://arxiv.org/html/2606.14245#bib.bib30)\] unifies several feature\-attribution approaches \(CAM, Grad\-CAM, Gradient × Input, and Integrated Gradients\) under a single explainability interface, while GNNBlockDTI\[[2](https://arxiv.org/html/2606.14245#bib.bib31)\] introduces a streamlined interpretability layer for GNN blocks, inspired by Grad\-CAM visualization\.
Beyond gradients, perturbation\-based methods interpret models by analyzing predictive shifts resulting from controlled input modifications\[[39](https://arxiv.org/html/2606.14245#bib.bib32)\]\. SHAP\[[16](https://arxiv.org/html/2606.14245#bib.bib33)\], rooted in game\-theoretic attribution, extends this idea through local surrogate modeling and Monte\-Carlo perturbation sampling\. Its utility has been empirically demonstrated in several DTI frameworks: Ru et al\.\[[19](https://arxiv.org/html/2606.14245#bib.bib34)\] refined regression\-tree feature importance using SHAP values, and VGAN\-DTI\[[9](https://arxiv.org/html/2606.14245#bib.bib35)\] carried out a feature\-level interpretability analysis over engineered fingerprints and physicochemical descriptors using the same principle\. A distinct strand of research explores counterfactual explanations, which seek minimal perturbations capable of altering predicted affinity\[[31](https://arxiv.org/html/2606.14245#bib.bib36),[20](https://arxiv.org/html/2606.14245#bib.bib37)\]\.
Within this paradigm, MACDA\[[18](https://arxiv.org/html/2606.14245#bib.bib38)\] formulates DTI explanation as a multi\-agent reinforcement learning task, jointly modifying drug graphs and protein sequences through chemically valid operations\. By optimizing for minimal structural changes that cause maximal shifts in predicted binding affinity, MACDA produces counterfactual drug\-protein pairs that illuminate the causal reasoning underlying deep DTA models\.
### 2\.2 BridgeDPI
BridgeDPI\[[32](https://arxiv.org/html/2606.14245#bib.bib7)\] is a drug\-protein interaction prediction framework that can be trained on larger datasets by introducing bridge nodes, which allow the model to train in batches and update the specific routes from drugs to proteins\. Drug\-drug, protein\-protein, and drug\-protein node pairs are not connected directly; bridge nodes therefore become the bottleneck and the main route through which drugs and proteins pass messages during training\.
The importance of bridge nodes is repeatedly mentioned in the original paper; however, no ablation study is reported that tests why they are needed \(for example, edge weights between bridge nodes and drugs/proteins could be visualized to show their importance\)\. The interpretability potential of the bridge nodes is left unused, and the paper contributes little to the explainability and interpretability of the model\.
## 3 Materials and Methods
Multiple XAI methods are used on a black\-box model to investigate important parts of the inputs and analyze the structure of the main model, as Fig\. [1](https://arxiv.org/html/2606.14245#S3.F1) shows\. BridgeDPI, a multi\-layer DPI black\-box model, is selected as the main model to be analyzed due to its strong results across different performance measures\. The analysis demonstrates the potential of the XAI tools used to reveal important regions of the model’s input\. Additionally, the modules within BridgeDPI, such as bridge nodes, are analyzed to understand their specific role in the decision\-making process\.
Figure 1: An overview of the main method used in this paper: applying XAI methods to a DPI model to highlight the important parts of the inputs and analyzing various layers of the DPI model\.
### 3\.1 XAI methods
A suite of complementary explainability techniques is employed to expose the internal reasoning of the BridgeDPI model: Simple Saliency, Integrated Gradients \(IG\), SmoothGrad applied to both IG and Simple Saliency, and Layer\-wise Relevance Propagation \(LRP\)\. Each method provides a distinct perspective on how drug and protein features contribute to the final prediction\. Combined use of these techniques can mitigate the drawbacks of each individual method and provide a clearer view of the model’s reasoning\.
Simple saliency maps\[[24](https://arxiv.org/html/2606.14245#bib.bib21)\] serve as the foundational gradient\-based approach for sensitivity analysis\. By computing the derivative of the output with respect to each input feature, they expose local dependencies between molecular substructures and predicted affinities\. Saliency maps highlight atoms, residues, and physicochemical descriptors exerting strong influence on model output\. Although straightforward and computationally efficient, simple saliency maps can display noisy attribution patterns, which motivates complementary smoothing and integration strategies\.
To increase numerical stability and theoretical consistency, Integrated Gradients \(IG\)\[[27](https://arxiv.org/html/2606.14245#bib.bib22)\] is adopted, which accumulates gradients along a linear path between a baseline input and the actual representation\. This integration accounts for the total effect of each feature on the prediction, satisfying the sensitivity and implementation invariance axioms\. IG yields smoother and more reliable interpretations by identifying the features contributing most strongly to binding affinity\.
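As a minimal sketch of the path integral described above, the following pure\-Python example approximates IG with a midpoint Riemann sum on a toy logistic model\. The model `f`, its weights, the all\-zero baseline, and the step count are illustrative assumptions, not BridgeDPI components or the settings used in this study\.

```python
import math

# Hypothetical toy model: f(x) = sigmoid(w . x + b). This is NOT BridgeDPI.
W = [0.8, -1.2, 0.5]
B = 0.1

def f(x):
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))

def grad_f(x):
    # Analytic gradient of the sigmoid-linear model: f(x) * (1 - f(x)) * w.
    p = f(x)
    return [p * (1.0 - p) * w for w in W]

def integrated_gradients(x, baseline, steps=200):
    # Midpoint Riemann approximation of the IG path integral along the
    # straight line from `baseline` to `x`.
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        total = [t + g for t, g in zip(total, grad_f(point))]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]

x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]  # assumed baseline; other choices are possible
ig = integrated_gradients(x, baseline)

# Completeness check: attributions should sum to f(x) - f(baseline).
completeness_gap = abs(sum(ig) - (f(x) - f(baseline)))
```

The last line checks the completeness property of IG: up to discretization error, the attributions sum to the difference between the model output at the input and at the baseline, which is a convenient sanity test for any IG implementation\.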
SmoothGrad\[[25](https://arxiv.org/html/2606.14245#bib.bib39)\] can be applied to IG to mitigate the noise inherent in single\-pass gradient estimation\. SmoothGrad\-IG averages multiple IG computations over stochastically perturbed inputs, generating visually coherent attribution maps that preserve underlying causal relationships\. A parallel smoothing strategy can also be integrated with Simple Saliency\. SmoothGrad\-Saliency averages raw gradient outputs over perturbed samples, reducing background noise and amplifying stable regions\.
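The averaging step can be sketched as follows\. The wrapper takes any attribution function as `explainer`, so the same code covers SmoothGrad\-Saliency \(raw gradient\) and, if an IG function is passed instead, SmoothGrad\-IG\. The toy gradient, the noise scale `sigma`, and the sample count are hypothetical choices for illustration only\.

```python
import math
import random

W = [0.8, -1.2, 0.5]  # hypothetical weights of a toy logistic model

def grad_f(x):
    # Raw gradient (simple saliency) of sigmoid(w . x).
    z = sum(w * xi for w, xi in zip(W, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return [p * (1.0 - p) * w for w in W]

def smoothgrad(x, explainer, n_samples=50, sigma=0.1, seed=0):
    # Average the explainer's output over Gaussian-perturbed copies of x.
    rng = random.Random(seed)
    acc = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        acc = [a + v for a, v in zip(acc, explainer(noisy))]
    return [a / n_samples for a in acc]

x = [1.0, 2.0, -1.0]
sg_saliency = smoothgrad(x, grad_f)
```

Fixing the seed makes the smoothed map reproducible, which matters when attributions from several runs are later averaged or compared\.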
Layer\-wise Relevance Propagation \(LRP\)\[[1](https://arxiv.org/html/2606.14245#bib.bib40)\] can be incorporated to complement gradient\-based analyses\. Unlike differential sensitivity methods, LRP redistributes the model’s output relevance backward through the network layers, quantitatively decomposing predictions into neuron\-level contributions\. This allows tracing predicted affinity scores directly to input features while conserving total relevance across the architecture\.
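For concreteness, one widely used LRP propagation rule is the $\epsilon$\-rule, shown here only as an illustration; the specific rule and the symbols below are assumptions, since the text does not state which LRP variant is applied to BridgeDPI\. For a layer with input activations $a_i$, weights $w_{ij}$, and output relevances $R_j$:

$$
R_i = \sum_j \frac{a_i w_{ij}}{z_j + \epsilon \cdot \operatorname{sign}(z_j)} \, R_j, \qquad z_j = \sum_{i'} a_{i'} w_{i'j}
$$

In the limit $\epsilon \to 0$ and without bias terms, summing over $i$ gives $\sum_i R_i = \sum_j R_j$, which is the layer\-to\-layer conservation of relevance referred to above\.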
### 3\.2 Datasets
Three datasets are used in this study: BindingDB \(Gao version\), C\.elegans, and Human\.
BindingDB dataset: Affinity data for 2286319 drug\-protein pairs, covering 8536 proteins and 989383 drugs, are collected from the corresponding research papers\[[4](https://arxiv.org/html/2606.14245#bib.bib2)\] \([https://www\.bindingdb\.org/bind/index\.jsp](https://www.bindingdb.org/bind/index.jsp)\)\. Most drug\-target pairs in BindingDB are reported as IC50 measurements, followed by $K_{i}$, EC50, and $K_{d}$ values\. The Gao version\[[3](https://arxiv.org/html/2606.14245#bib.bib3)\] \([https://github\.com/IBM/InterpretableDTIP](https://github.com/IBM/InterpretableDTIP)\) filters the data based on IC50 values and converts the affinity scores to binary interactions using thresholds of 100 nM and 10000 nM\.
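The two\-threshold binarization can be sketched as below\. The direction of the thresholds \(low IC50 as positive, high IC50 as negative\), the strict inequalities, and the choice to drop intermediate values are assumptions made for illustration; the function and record names are hypothetical and do not come from the Gao preprocessing code\.

```python
def binarize_ic50(ic50_nm, pos_max=100.0, neg_min=10000.0):
    """Map an IC50 value in nM to 1 (interaction), 0 (no interaction),
    or None (between the thresholds, treated here as ambiguous)."""
    if ic50_nm < pos_max:
        return 1
    if ic50_nm > neg_min:
        return 0
    return None

# Toy records: (drug id, protein id, IC50 in nM). Values are made up.
records = [
    ("drugA", "protX", 35.0),
    ("drugB", "protX", 2500.0),
    ("drugC", "protY", 50000.0),
]

labeled = [(d, p, binarize_ic50(v)) for d, p, v in records]
# Keep only pairs with an unambiguous label.
labeled = [row for row in labeled if row[2] is not None]
```

Dropping the intermediate band widens the margin between classes, which tends to make the binary task easier than the underlying regression problem and should be kept in mind when reading attribution results\.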
C\.elegans and Human datasets: These datasets are constructed by combining a set of highly credible negative drug\-protein samples, obtained via an in silico screening method, with the known positive samples\[[15](https://arxiv.org/html/2606.14245#bib.bib4)\]\. The balanced versions of these datasets are used here\[[28](https://arxiv.org/html/2606.14245#bib.bib5)\] \([https://github\.com/masashitsubaki/CPI\_prediction](https://github.com/masashitsubaki/CPI_prediction)\)\. The C\.elegans dataset has 7786 drug\-protein pairs\. Proteins in this dataset are from the species Caenorhabditis elegans, a small nematode worm widely used as a model organism in biology\. C\.elegans proteins include homologs of human proteins but may also include nematode\-specific proteins\. The Human dataset has 6728 drug\-protein pairs in total\. Proteins in this dataset are human proteins from a wide range of classes, such as enzymes \(e\.g\., kinases like EGFR and CDK2\), receptors \(GPCRs, nuclear hormone receptors\), and ion channels\.
In general, models should perform better on the Human and C\.elegans datasets due to the high overlap between training and test entities, whereas Gao’s dataset requires more generalization to unseen compounds and proteins\.
Table 1: Dataset statistics\. For the Gao dataset, in the first part of the table, the values in parentheses are totals for all cataloged proteins/drugs \(some may not appear in any interaction\), while the values before the parentheses count only those that actually participate in interactions\.

| Features/Datasets | Gao | C\.elegans | Human |
| --- | --- | --- | --- |
| \# Unique Proteins in interactions | 813 \(842\) | 1876 | 2001 |
| \# Unique Drugs in interactions | 49752 \(171926\) | 1767 | 2726 |
| \# Total Amino Acids | 471022 \(489412\) | 1074521 | 1181424 |
| \# Total Drug Atoms | 5165804 \(1511956\) | 32915 | 61577 |
| \# Unique Amino Acids | 21 | 21 | 22 |
| \# Unique Drug Atoms | 21 \(26\) | 28 | 62 |
| Average Protein sequence Length | 579\.36 \(581\.24\) | 572\.77 | 590\.41 |
| Average Drug sequence Length | 30\.38 \(30\.04\) | 18\.62 | 22\.58 |
| \# Train interactions | 50155 \(28240 pos / 21915 neg\) | 6228 \(3139 pos / 3089 neg\) | 5382 \(2700 pos / 2682 neg\) |
| \# Valid interactions | 5607 \(2831 pos / 2776 neg\) | N/A | N/A |
| \# Test interactions | 5508 \(2706 pos / 2802 neg\) | 1558 \(754 pos / 804 neg\) | 1346 \(664 pos / 682 neg\) |
| % Protein warm setting | 90\.3 % valid / 91\.0 % test | 85\.5 % valid | 81\.8 % valid |
| % Drug warm setting | 30\.1 % valid / 30\.3 % test | 74\.8 % valid | 60\.7 % valid |
| Protein duplicacy index | 75\.36 | 4\.15 | 3\.36 |
| Drug duplicacy index | 1\.23 | 4\.41 | 2\.47 |
| % Proteins with ≥2 interactions | 83\.9% | 69\.0% | 61\.9% |
| % Drugs with ≥2 interactions | 15\.5% | 54\.9% | 38\.8% |
| \# Max edges/protein | 1459 | 504 | 425 |
| \# Max edges/drug | 72 | 399 | 264 |
| Aliphatic \(A, V, L, I, M\) | 30\.52% | 30\.58% | 30\.78% |
| Polar uncharged \(S, T, N, Q\) | 21\.04% | 21\.65% | 20\.85% |
| Positive \(K, R, H\) | 13\.65% | 14\.10% | 13\.38% |
| Aromatic \(F, Y, W\) | 8\.43% | 8\.22% | 8\.54% |
| Sulfur \(C, M\) | 4\.68% | 4\.25% | 4\.54% |
| Special \(G, P\) | 12\.4% | 10\.96% | 12\.58% |
| Acidic residues \(D, E\) | 11\.64% | 12\.80% | 11\.70% |
| P with any aromatic | 100% | 100% | 100% |
| P with ≥2 Cys | 96\.8% | 90\.4% | 94\.7% |
| Any rings / Aromatic rings | 99\.7% / 98\.6% | 72\.0% / 61\.5% | 73\.8% / 52\.8% |
| Heterocycles | 92\.9% | 50\.1% | 50\.3% |
| Fused systems | 68\.1% | 37\.1% | 41\.2% |
| Avg rings/molecule | 3\.98 | 1\.95 | 2\.33 |
| Avg aromatic rings | 2\.95 | 1\.28 | 1\.13 |
| Top Murcko scaffold | benzene \(c1ccccc1, 1\.04%\) | benzene \(c1ccccc1, 10\.07%\) | benzene \(c1ccccc1, 6\.05%\) |

Figure 2: Amino acid and atom frequencies\. C\.elegans AA vocab \(L S A E V K G D I T R P N F Q Y M H C W U\), Gao AA vocab \(L S A E G V P R K T D I Q F N Y H M C W U\), Human AA vocab \(L S A G E V P K T R D I Q F N Y H M C W U X\), C\.elegans atom vocab \(C O N F S Cl P Br I Na Ca K Fe As Hg Mg B Co Zn Ce Pb U Ni Cu Sn Ag H W\), Gao atom vocab \(C N O F S Cl Br P I Na B Si Se Re V Ru K Fe As Gd Sb\), Human atom vocab \(C O N S F Cl P Na Br H I W K B Si Cu V Ca Li Zn Fe As Al Pb Hg Mo Ag Se Co Mg Sb Sr Mn Ni Ba Cr Zr Pt Tl Ta In Cd Ge Rb Ce Y Pd Au Sn Ho Ru Cf Ar Ga Be Po Bi Ir Ti Fr Tc He\)\. Bars that cannot be seen are not zero; their values are very low\.
Table [1](https://arxiv.org/html/2606.14245#S3.T1) summarizes the three interaction corpora used for training and evaluation\. Collectively, they span markedly different scales, reuse patterns, and small\-molecule complexity, which jointly constrain both predictive difficulty and how interpretability results should be read\.
The Gao collection couples a comparatively small protein inventory that participates in interactions \($\approx 800$ proteins in pairs; larger totals exist when catalog entries never observed in pairs are included\) with an exceptionally large compound inventory \($\approx 5\times 10^{4}$ drugs in pairs versus $\approx 1.8-2.7\times 10^{3}$ on C\.elegans and Human\)\. Consequently, aggregated amino\-acid and atom counts are dominated by Gao’s breadth of chemistry even though average protein lengths remain similar across datasets \($\approx 573-590$ residues\)\. Average molecular size along the modeled drug representation is largest on Gao \($\approx 30$ atoms\) versus shorter typical graphs on C\.elegans \($\approx 19$\) and Human \($\approx 23$\), increasing representation complexity and sparsity risk for attribution maps on the screening\-style corpus\.
Training splits are approximately balanced between positives and negatives where reported; Gao additionally includes dedicated validation partitions\. Warmth differs sharply by modality on Gao: protein\-side warmth is high \($\approx 90\%$ validation/test\), whereas drug\-side warmth is low \($\approx 30\%$\), so evaluation emphasizes novel compounds against recurrent targets, a stringent cold\-drug regime\. C\.elegans and Human instead show moderate protein warmth \($\approx 82-86\%$ validation where listed\) paired with substantially higher drug warmth \($\approx 61-75\%$\), shifting failure modes toward protein novelty \(relative to Gao’s compound novelty\) rather than uniformly identical evaluation pressures across benchmarks\.
Protein duplicacy is extreme on Gao \(index $\approx 75$ versus $\approx 3-4$ on the other corpora\): many interactions reuse the same proteins \($\approx 84\%$ with $\geq 2$ edges; maxima exceeding $10^{3}$ partners per protein\), whereas drug duplicacy remains modest \($\approx 1.2$; $\approx 16\%$ of drugs with $\geq 2$ interactions\)\. C\.elegans and Human exhibit more balanced reuse, including higher fractions of drugs appearing in multiple pairs\. For modeling and explanation, hub structure implies strong protein\-centric averaging on Gao \(gradient or occlusion signals may reflect targets seen under many chemotypes\), whereas organism subsets emphasize different reuse geometries when interpreting consensus features\.
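The duplicacy index is not defined explicitly in the text, but the values in Table 1 are consistent with the ratio of total interactions to unique entities\. The sketch below assumes that definition and recomputes the indices from the counts reported in Table 1; the dictionary layout and function name are illustrative\.

```python
# Counts copied from Table 1: total interactions (train + valid + test)
# and unique proteins/drugs that participate in interactions.
DATASETS = {
    "Gao": {"interactions": 50155 + 5607 + 5508, "proteins": 813, "drugs": 49752},
    "C.elegans": {"interactions": 6228 + 1558, "proteins": 1876, "drugs": 1767},
    "Human": {"interactions": 5382 + 1346, "proteins": 2001, "drugs": 2726},
}

def duplicacy(n_interactions, n_unique):
    # Assumed definition: average number of interactions per unique entity.
    return n_interactions / n_unique

# (protein duplicacy, drug duplicacy) per dataset, rounded to two decimals.
indices = {
    name: (
        round(duplicacy(d["interactions"], d["proteins"]), 2),
        round(duplicacy(d["interactions"], d["drugs"]), 2),
    )
    for name, d in DATASETS.items()
}
```

Under this assumed definition the recomputed values match the protein and drug duplicacy rows of Table 1, which makes the hub structure on Gao \(few proteins, each seen with many compounds\) easy to quantify\.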
Amino\-acid inventories are stable \(21\-22 residue types with similar coarse bins; $\approx 31\%$ aliphatic across sets\)\. Thus differences in protein\-level explanations across benchmarks are unlikely to reflect incompatible alphabets; differences more plausibly reflect task geometry, warmth, and organism\-specific sequence composition\. Fig\. [2](https://arxiv.org/html/2606.14245#S3.F2) corroborates shared marginal residue frequencies dominated by abundant residues \(L, S, A, G, V; acidic/basic types are also frequent\)\. These histograms characterize whole\-sequence corpus composition, so they should not be equated with binding\-site enrichment unless complemented by positional or structural analyses\.
Drug topology diverges strongly: Gao compounds overwhelmingly contain rings and aromatic systems with abundant heterocycles and fused frameworks \(several rings per molecule on average\), whereas C\.elegans and Human show lower aromatic\-ring incidence and fewer rings per molecule, consistent with less uniformly drug\-like libraries and differing scaffold diversity \(benzene\-type Murcko cores rank first in each set but at dataset\-dependent prevalence\)\. The elemental distributions in Fig\. [2](https://arxiv.org/html/2606.14245#S3.F2) show C/O/N dominance as expected; halogens and heteroatoms appear at smaller marginal rates with dataset\-specific shifts \(interpret cautiously, because preprocessing, protonation assumptions, and graph featurization strongly shape counted atom types, especially hydrogens and alkali ions when present\)\.
Cross\-dataset explanation comparisons should be read jointly with cold\-drug severity on Gao, hubbed proteins, mean molecular complexity, and topological priors: identical attribution machinery can emphasize different apparent pharmacophores simply because the underlying chemistry frequency tables differ\.
### 3\.3 Training of the main model
As Table[2](https://arxiv.org/html/2606.14245#S3.T2)shows, the model demonstrates strong performance with high accuracy and excellent discriminative ability, indicating that it has apparently learned meaningful patterns in the drug\-target interaction data\. The balanced precision and recall suggest the model captures both positive and negative interactions effectively, while the low loss values indicate confident predictions\. The reasonable gap between training and validation performance shows good generalization without overfitting, meaning the model has learned generalizable features rather than memorizing the training data\. The high discriminative ability particularly indicates that the model has identified robust patterns that should manifest as strong, interpretable gradients when analyzing which protein and drug features contribute most to the interaction predictions\.
Although biological validation of explainability results \(e\.g\., comparison of model\-attributed residues or atoms with experimentally confirmed binding pockets\) would provide the most direct evidence of interpretability, such validation is not yet feasible at scale\. The commonly used datasets—BindingDB, Davis/KIBA, and C\.elegans/Human—rarely include residue\-level structural annotations or complete co\-crystal data\. Even when structural information is available in external databases such as PDB or PDBBind, only a small subset of targets overlap with those used in benchmark datasets, making systematic mapping unreliable\. Therefore, this study focuses on input\-level interpretability—identifying the regions of drugs and proteins that most strongly influence model predictions—while recognizing that deeper biological alignment remains an open challenge for the field\.
The model architecture was modified to replace the element\-wise multiplication operation between protein and drug components with concatenation to facilitate more interpretable gradient analysis\. This architectural change was motivated by the need to obtain cleaner, independent gradients for each modality during backpropagation\. While multiplication creates coupled gradients where changes in one component affect the gradient of the other, concatenation preserves independent gradient flow, allowing for more precise attribution analysis\. The transition from multiplication to concatenation in the final layer, while beneficial for independent gradient analysis of protein and drug components, inadvertently diminished the interpretability of bridge nodes by removing their role as mediators of protein\-drug interactions\.
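The coupling argument can be made explicit with a short derivation\. The symbols are illustrative and not taken from the BridgeDPI code: $\mathbf{p}, \mathbf{d} \in \mathbb{R}^{n}$ denote the protein and drug representations entering the fusion step, $\mathbf{z}$ the fused vector, and $y$ the model output\.

$$
\mathbf{z} = \mathbf{p} \odot \mathbf{d}
\;\Rightarrow\;
\frac{\partial y}{\partial p_i} = \frac{\partial y}{\partial z_i}\, d_i,
\qquad
\mathbf{z} = [\mathbf{p}; \mathbf{d}]
\;\Rightarrow\;
\frac{\partial y}{\partial p_i} = \frac{\partial y}{\partial z_i}, \quad 1 \le i \le n
$$

Under multiplication, the protein gradient is scaled by the drug representation at the fusion step itself, whereas under concatenation the fusion step adds no such factor; any remaining dependence on the other modality enters only through the downstream layers via $\partial y / \partial z_i$\.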
For gradient methods, both input $\times$ gradient and raw gradient were computed; because the results were very similar, the raw gradient is used in all sections\.
Table 2: Performance metrics \(mean ± std\) over 5 runs for testing bridge nodes on three datasets\.

| Setting | Accuracy | AUC | Precision | Recall | F1 |
| --- | --- | --- | --- | --- | --- |
| Gao w/ Bridge | 0\.9180 ± 0\.0014 | 0\.9700 ± 0\.0023 | 0\.9316 ± 0\.0186 | 0\.8998 ± 0\.0223 | 0\.9152 ± 0\.0033 |
| Gao w/o Bridge | 0\.9062 ± 0\.0082 | 0\.9680 ± 0\.0007 | 0\.9348 ± 0\.0193 | 0\.8704 ± 0\.0362 | 0\.9008 ± 0\.0113 |
| C\.elegans w/ Bridge | 0\.9730 ± 0\.0062 | 0\.9958 ± 0\.0016 | 0\.9704 ± 0\.0077 | 0\.9744 ± 0\.0078 | 0\.9724 ± 0\.0067 |
| C\.elegans w/o Bridge | 0\.9736 ± 0\.0076 | 0\.9962 ± 0\.0013 | 0\.9720 ± 0\.0096 | 0\.9736 ± 0\.0102 | 0\.9728 ± 0\.0079 |
| Human w/ Bridge | 0\.9638 ± 0\.0033 | 0\.9926 ± 0\.0023 | 0\.9728 ± 0\.0057 | 0\.9530 ± 0\.0114 | 0\.9626 ± 0\.0039 |
| Human w/o Bridge | 0\.9606 ± 0\.0062 | 0\.9928 ± 0\.0019 | 0\.9550 ± 0\.0133 | 0\.9664 ± 0\.0052 | 0\.9606 ± 0\.0060 |

Preprocessing follows the implementation released with the BridgeDPI paper \([https://github\.com/enai4bio/BridgeDPI](https://github.com/enai4bio/BridgeDPI)\)\. The maximum sequence lengths of proteins and drugs are set to 1024 and 128, respectively\. As Fig\. [3](https://arxiv.org/html/2606.14245#S3.F3) shows, BridgeDPI takes four inputs:
- 1\. AminoSeq: primary sequence vectors of proteins,
- 2\. AminoCTR: concatenation of 1\-mer, 2\-mer, and 3\-mer tensors of proteins,
- 3\. AtomFin: Morgan fingerprint vectors of drugs acquired from SMILES of drugs,
- 4\. AtomFea: graph features from SMILES for drugs\.
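To make the AminoCTR input more concrete, the sketch below builds a concatenated 1\-mer, 2\-mer, and 3\-mer composition vector for a protein sequence\. The exact encoding used by BridgeDPI \(vocabulary, normalization, counts versus frequencies\) is not specified in the text, so the 20\-letter alphabet, the frequency normalization, and the function names are assumptions for illustration only\.

```python
from collections import Counter
from itertools import product

# Assumed vocabulary: the 20 standard residues. The datasets also contain
# rare symbols such as U and X, which this toy encoding simply ignores.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def kmer_frequencies(seq, k, alphabet=ALPHABET):
    # Enumerate all possible k-mers in a fixed order, then count occurrences.
    kmers = ["".join(t) for t in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts[m] for m in kmers), 1)
    return [counts[m] / total for m in kmers]

def ctr_vector(seq):
    # Concatenate 1-mer, 2-mer, and 3-mer frequency vectors.
    vec = []
    for k in (1, 2, 3):
        vec.extend(kmer_frequencies(seq, k))
    return vec

v = ctr_vector("MKTAYIAKQR")  # toy sequence, not from any dataset
```

With a 20\-letter alphabet the vector has $20 + 400 + 8000 = 8420$ entries, most of them zero for a single protein, which is one reason per\-feature attributions on composition inputs tend to be sparse\.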
Figure 3:Overview of the BridgeDPI pipeline including inputs and different modules\.
## 4 Experiments and Results
All experiments are conducted with the default parameters of the BridgeDPI paper, some of which are listed in Table [3](https://arxiv.org/html/2606.14245#S4.T3)\. Any change in parameters is mentioned in the corresponding section\. For the Gao dataset, the training split is fixed across runs, whereas for Human and C\.elegans preprocessing is performed five times to obtain different splits, since the split is random\. The main performance measures may differ from the numbers reported in the original paper due to differences in parameters and fine\-tuning\. The test set of each dataset is used for the post\-hoc explanations\. Stratified sampling is used, and procedures are repeated five times to obtain the final mean result\. Furthermore, batches and minibatches across runs are also used, and the average vectors over them are reported\.
Other studies typically examine several case studies to show that XAI outputs \(attribution scores over regions\) align with biological explanations \(known pockets\)\. In contrast, this work aggregates scores over the test set to determine the overall importance of the inputs to the model\.
Table 3: Hyperparameters of the trained BridgeDPI model. pCNN: protein TextCNN, pFcLinear: drug FFN, cFcLinear: drug FFN, dCNN: drug TextCNN, dFcLinear: drug FFN, fFcLinear: drug FFN.

| Hyperparameter | Value |
| --- | --- |
| FeatureSize of pCNN | 24 |
| Filter size of pCNN | 25 |
| Filter num of pCNN | 64 |
| Neuron num of pFcLinear | 128 |
| Neuron num of cFcLinear | 1024, 128 |
| Filter size of dCNN | 7 |
| Filter num of dCNN | 64 |
| Neuron num of dFcLinear | 128 |
| Neuron num of fFcLinear | 1024, 256, 128 |
| Num of bridge nodes | 64 |
| Neuron num of GNN | 128, 128, 128 |
| Decoder neuron num | 128, 1 |
| Optimizer | Adam (lr: 0.001) |
| Weight decay | 0.001 |
| Batch size | 512 |
| Number of epochs | 100 |
| Patience | 30 |
| Dropouts | 0.5 |

In general, layerwise gradient analysis is used to investigate the different sections of the main model. Following the BridgeDPI pipeline in Fig. [3](https://arxiv.org/html/2606.14245#S3.F3), the layers whose analysis is most informative are gcn-output, gcn-input, the GNN module, and the input layer. Other layers were also analyzed and gave results similar to these four sections, so they are not discussed further.
### 4\.1Bridge node importance
This section discusses the importance of the GNN module in the BridgeDPI framework.
As Table [2](https://arxiv.org/html/2606.14245#S3.T2) shows, the model is trained both with and without bridge nodes to assess how much the graph structure and the graph neural network contribute to the final prediction. The drop in performance without bridge nodes is small (accuracy on Gao falls from 0.9180 to 0.9062, and on C.elegans it does not fall at all), which indicates that bridge nodes contribute little to the prediction overall.
Figure 4: Plots of the first analysis of bridge node importance.

As Fig. [4](https://arxiv.org/html/2606.14245#S4.F4) shows, the heatmaps summarize the batch-averaged rectified cosine similarity matrix $\overline{\mathbf{C}}$ between pre-GCN nodes: protein (index 0), drug (index 1), and learned bridge rows ($\geq 2$). Entries are cosine similarities of $\ell_2$-normalized node vectors, with negative values removed and the diagonal set to $1$; the figure masks the diagonal for color scaling and draws it separately, so off-diagonal color reflects inter-node coupling only. Across Gao, Human, and C.elegans, off-diagonal mass is mostly low with scattered brighter pairs, indicating weak average coupling for most node pairs, a few stronger pairs, and no clear block or community structure; the overall sparsity pattern is qualitatively similar between corpora.
The grouped bars plot node-level summaries of the same matrix: the row sum and row mean of $\overline{\mathbf{C}}$ over all node indices. Values are comparatively uniform, with only modest peaks and troughs, which suggests that aggregate cosine mass is spread across coordinates rather than concentrated on a tiny hub subset, and that dataset-to-dataset shifts in node ranking are mild at this summary level.
Connectivity histograms refer only to learned bridge nodes ($\geq 2$). Degree strength is the row sum of $\overline{\mathbf{C}}$ with the diagonal zeroed (sum of off-diagonal rectified cosines). Binary degree counts off-diagonal entries that are strictly positive (no extra threshold). Distributions overlap substantially; Human shows a slightly heavier right tail in degree strength, while Gao is shifted toward somewhat larger binary degrees in the upper range, with Human and C.elegans more central. Overall, the bridge acts as a fairly dense rectified-cosine scaffold with small dataset-specific shifts in edge mass and sparsity rather than a qualitative change in layout.
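The rectified cosine matrix and the two bridge-node connectivity summaries defined above can be sketched in a few lines. This is a minimal single-sample illustration with made-up 2-dimensional node vectors; batch averaging is omitted and the function names are hypothetical.

```python
import math

def rectified_cosine_matrix(nodes):
    """Cosine similarity of L2-normalized rows, negatives clipped to 0, diagonal set to 1."""
    unit = []
    for row in nodes:
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        unit.append([x / norm for x in row])
    n = len(unit)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                C[i][j] = 1.0
            else:
                dot = sum(a * b for a, b in zip(unit[i], unit[j]))
                C[i][j] = max(dot, 0.0)
    return C

def bridge_degrees(C, first_bridge=2):
    """Map each bridge row to (degree strength, binary degree) over off-diagonal entries."""
    out = {}
    for i in range(first_bridge, len(C)):
        off_diag = [C[i][j] for j in range(len(C)) if j != i]
        out[i] = (sum(off_diag), sum(1 for v in off_diag if v > 0.0))
    return out

# Toy node matrix: protein (row 0), drug (row 1), and two bridge rows.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
C = rectified_cosine_matrix(nodes)
degrees = bridge_degrees(C)
```

Here the first bridge row couples to both protein and drug, while the second is anti-aligned with the protein and orthogonal to the drug, so rectification leaves it isolated.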
To relate these similarities to predictive behavior at the GCN, we analyze edges in the normalized adjacency actually used in message passing (Fig. [5](https://arxiv.org/html/2606.14245#S4.F5)). Pre-GCN node features are fixed (detached); the trainable part enters through an adjacency tensor $\mathbf{A}$ obtained from the same rectified cosines with ones on the diagonal. The GCN uses $\mathbf{pL} = \mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$ with $D_{ii} = \sum_{j} A_{ij}$, followed by the graph convolution and final regressor yielding a scalar score $y$ per sample.
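As a small worked instance of the symmetric normalization above, the following sketch computes $\mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$ for a toy 3-node adjacency. The matrix values and the function name are assumptions for illustration only.

```python
import math

def normalized_adjacency(A):
    """Return D^{-1/2} A D^{-1/2} with D_ii = sum_j A_ij."""
    deg = [sum(row) for row in A]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    n = len(A)
    return [[inv_sqrt[i] * A[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]

# Toy rectified-cosine adjacency with ones on the diagonal.
A = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.5],
     [0.0, 0.5, 1.0]]
pL = normalized_adjacency(A)
```

Because every edge weight is divided by the square roots of both endpoint degrees, removing one edge changes the weights of all remaining edges at those two nodes, which is one reason ablation effects need not add up linearly.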
Edge-level sensitivity scores are computed by treating $\mathbf{A}$ as a differentiable variable. Unless noted otherwise, our implementation uses integrated gradients along straight-line paths from an identity adjacency to $\mathbf{A}$ (25 steps), accumulated for up to $\min(B, 128)$ samples, averaged, and with the diagonal zeroed before visualization; alternatively, plain $|\partial y/\partial \mathbf{A}|$ can be used by changing the attribution mode. The overlaid histograms show the distribution of these scores over off-diagonal entries (heavy tail: most edges near zero, a sparse subset much larger).
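The straight-line integrated-gradients rule can be illustrated on a toy score whose gradient is known in closed form. Only the identity baseline and the 25 steps come from the text; the quadratic score, its weights `W`, and the midpoint Riemann rule are assumptions chosen so the example is self-contained.

```python
def integrated_gradients(grad_fn, baseline, target, steps=25):
    """Midpoint Riemann approximation of IG along the straight line baseline -> target."""
    n = len(target)
    avg_grad = [[0.0] * n for _ in range(n)]
    for s in range(steps):
        alpha = (s + 0.5) / steps
        point = [[baseline[i][j] + alpha * (target[i][j] - baseline[i][j])
                  for j in range(n)] for i in range(n)]
        g = grad_fn(point)
        for i in range(n):
            for j in range(n):
                avg_grad[i][j] += g[i][j] / steps
    return [[(target[i][j] - baseline[i][j]) * avg_grad[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical differentiable score: y(A) = sum_ij W_ij * A_ij^2.
W = [[0.0, 2.0, -1.0], [2.0, 0.0, 0.5], [-1.0, 0.5, 0.0]]

def score(A):
    return sum(W[i][j] * A[i][j] ** 2 for i in range(3) for j in range(3))

def grad(A):
    return [[2.0 * W[i][j] * A[i][j] for j in range(3)] for i in range(3)]

identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
A = [[1.0, 0.6, 0.1], [0.6, 1.0, 0.3], [0.1, 0.3, 1.0]]
ig = integrated_gradients(grad, identity, A)
```

The attributions sum to `score(A) - score(identity)`, the completeness property of integrated gradients, and the diagonal receives zero because it is identical in the baseline and the target.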
For top-$K$ interventions ($K \in \{50, 200\}$), undirected off-diagonal pairs are ranked separately for protein-bridge edges (incident to node 0) and drug-bridge edges (incident to node 1) using the same sensitivity scores. For each family, the $K$ highest-ranked entries of $\mathbf{A}$ are set to zero symmetrically, the diagonal is restored to $1$, $\mathbf{pL}$ is rebuilt, and $y$ is recomputed without gradients. We report the batch mean of $(y_{\mathrm{base}} - y_{\mathrm{abl}})/(|y_{\mathrm{base}}| + 10^{-6})$. Positive values mean removal tends to decrease the score (net supportive under this perturbation); negative values mean removal tends to increase it (net suppressive).
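The top-$K$ intervention can be sketched as a symmetric edge removal followed by the normalized score change. The 4-node adjacency, the sensitivity values, and `toy_score` (a degree-normalized readout at the protein node standing in for the GCN and regressor) are all hypothetical.

```python
def normalized_delta(y_base, y_abl, eps=1e-6):
    """(y_base - y_abl) / (|y_base| + eps); positive means the removed edges were supportive."""
    return (y_base - y_abl) / (abs(y_base) + eps)

def ablate_top_k_edges(A, sensitivity, node, k, first_bridge=2):
    """Zero the k highest-sensitivity edges between `node` and bridge nodes, symmetrically."""
    n = len(A)
    candidates = sorted(range(first_bridge, n),
                        key=lambda j: sensitivity[node][j], reverse=True)
    out = [row[:] for row in A]
    for j in candidates[:k]:
        out[node][j] = 0.0
        out[j][node] = 0.0
    for i in range(n):
        out[i][i] = 1.0  # restore the diagonal, as described in the text
    return out

def toy_score(A, values=(0.2, 0.1, 0.9, -0.4)):
    """Hypothetical readout: degree-normalized aggregation at the protein node (row 0)."""
    row = A[0]
    return sum(a * v for a, v in zip(row, values)) / sum(row)

# Nodes: protein (0), drug (1), bridges (2, 3).
A = [[1.0, 0.0, 0.8, 0.5],
     [0.0, 1.0, 0.3, 0.6],
     [0.8, 0.3, 1.0, 0.2],
     [0.5, 0.6, 0.2, 1.0]]
sens = [[0.0, 0.0, 0.9, 0.4],
        [0.0, 0.0, 0.1, 0.7],
        [0.9, 0.1, 0.0, 0.0],
        [0.4, 0.7, 0.0, 0.0]]

ablated = ablate_top_k_edges(A, sens, node=0, k=1)
delta = normalized_delta(toy_score(A), toy_score(ablated))
```

In this toy case the removed protein-bridge edge carried a positive contribution, so `delta` is positive and the edge reads as net supportive.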
The histograms are strongly heavy-tailed on all three datasets, with Gao showing slightly more mass at small non-zero sensitivities; the top-$K$ bars add directionality. For $K = 50$, protein-bridge removals are net suppressive on Gao and especially Human, and weakly suppressive on C.elegans; drug-bridge removals are strongly suppressive on Gao and C.elegans but net supportive on Human. For $K = 200$, Gao protein-bridge removals become net supportive while drug-bridge removals stay suppressive; Human protein-bridge removals stay suppressive; on C.elegans, drug-bridge removals stay suppressive and protein-bridge removals become weakly supportive.
Thus predictive sensitivity is concentrated on a small edge subset, while the net supportive versus suppressive role of the highest-sensitivity protein- versus drug-bridge connections is dataset-dependent; sign changes when increasing $K$ are expected because larger ablations interact nonlinearly through renormalization and the GCN stack.
Overall, at the larger $K$ setting illustrated in Fig. [5](https://arxiv.org/html/2606.14245#S4.F5), Gao is broadly consistent with net supportive protein-bridge mass together with net suppressive drug-bridge mass, Human with the opposite pattern, and C.elegans with weakly supportive protein-bridge mass together with net suppressive drug-bridge mass.
Figure 5: Plots of the second analysis of bridge node importance.

Overall, the aggregate bridge visualizations suggest a fairly uniform rectified-cosine scaffold rather than a sharply modular or community-structured graph; correspondingly, the static node-to-node similarity maps alone are less discriminative than the heavy-tailed, task-dependent edge sensitivities at the GCN input. This does not diminish the architectural rationale of BridgeDPI: the contribution is largely operational, namely a batchable construction in which protein, drug, and learned bridge embeddings jointly induce a dense similarity matrix that can be symmetrized and fed through a GCN at each step, enabling training on large interaction corpora without building an explicit pairwise graph per mini-batch in an ad-hoc way. Conceptually, using rectified cosine affinities as nonnegative edge weights before normalized convolution resembles similarity-gated message passing; however, it should not be equated with a general graph attention network (GAT), which typically learns attention coefficients with a distinct parameterization and normalization scheme.
### 4\.2GCN input and output layers
The following sections discuss the analysis of the GCN layer input and output.
#### 4\.2\.1GCN output
Figure 6: GCN output result plots.

Analyzing the GCN output is useful because it characterizes how protein- and drug-side latent features are shaped by message passing before the final encoder produces the prediction. Comparing these summaries to explanations at the GCN input (and to the bridge similarity structure) helps separate what the graph module does to each modality from what the raw encoders provide.
We quantify the two sides with complementary summaries, each averaged over multiple random stratified subsamples for stability. First, we report mean absolute attributions for the 128-dimensional protein and drug latents using IG, plain saliency, LRP, SmoothGrad, and SmoothGrad-IG, which summarize local sensitivity magnitude to perturbations on each side. Second, we perform top-$k$ ablation by zeroing the highest-ranked latent dimensions on each side (separately) and measuring the normalized change in the prediction; the sign separates net supportive effects (positive $\Delta$) from net suppressive effects (negative $\Delta$). Jointly, magnitude and signed ablation distinguish "what the model reacts to strongly" from "what tends to push the score up versus down" under coordinated removal. When $k$ equals the full latent width (128), the procedure reduces to leave-one-side ablation of that modality at the GCN output.
Mean $|\text{attribution}|$ can be misleading in isolation if opposing influences cancel in the attributions; pairing it with signed top-$k$ / leave-one-side ablation reduces that ambiguity. These analyses are nonetheless layer-local: they do not trace compensatory rerouting in earlier layers, nor do they remove dataset composition effects (e.g., cold-drug evaluation regimes).
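A toy linear readout shows how mean absolute attribution and signed leave-one-side ablation can point in different directions. The 4-dimensional latent, the weights, and the use of $|w|$ as a saliency proxy are assumptions for illustration; the real latents are 128-dimensional per side.

```python
def mean_abs(values):
    """Mean absolute value, used here as a magnitude summary of attributions."""
    return sum(abs(v) for v in values) / len(values)

def leave_one_side_delta(w, z, side, eps=1e-6):
    """Zero every latent whose index is in `side` and return the normalized score change."""
    y_base = sum(wi * zi for wi, zi in zip(w, z))
    y_abl = sum(wi * zi for i, (wi, zi) in enumerate(zip(w, z)) if i not in side)
    return (y_base - y_abl) / (abs(y_base) + eps)

# Hypothetical latent vector: indices 0-1 are protein, 2-3 are drug.
w = [0.5, 0.4, -2.0, 1.5]   # linear readout weights; |w| is the saliency of each latent
z = [1.0, 1.0, 1.0, 1.0]

protein_magnitude = mean_abs(w[:2])
drug_magnitude = mean_abs(w[2:])
protein_delta = leave_one_side_delta(w, z, side={0, 1})
drug_delta = leave_one_side_delta(w, z, side={2, 3})
```

The drug side has the larger magnitude, yet removing it raises the score (negative $\Delta$), while removing the lower-magnitude protein side lowers it (positive $\Delta$). This is the kind of magnitude-versus-direction mismatch reported for Human and C.elegans.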
##### Results \(Fig\.[6](https://arxiv.org/html/2606.14245#S4.F6)\)
Gao. Across methods, mean $|\text{attribution}|$ is higher for drug latents than for protein latents, indicating greater local sensitivity to perturbations on the drug-side GCN output. The leave-one-side ablation is directionally consistent with a drug-driven positive contribution: drug-side removal yields a positive normalized $\Delta$ (supportive latents), whereas protein-side removal yields a negative $\Delta$ (net suppressive latents at this layer under joint top-$k$ selection).
Human. For IG, saliency, SmoothGrad, and SmoothGrad-IG, mean $|\text{attribution}|$ is modestly larger on the drug side; LRP is the exception, assigning larger magnitude to the protein side. Ablation nonetheless reverses the directional emphasis relative to Gao: protein-side removal produces a positive $\Delta$, while drug-side removal produces a negative $\Delta$. Thus, Human is a case where gradient-style magnitudes need not align with the net push/pull inferred from coordinated removal: the protein GCN output appears more supportive for the score, whereas the selected highly ranked drug latents behave more suppressively at this layer.
C.elegans. Mean $|\text{attribution}|$ is consistently larger on the drug side across methods (the gap is especially visible for IG and SmoothGrad-IG), so perturbations of drug latents produce the strongest local attributions. However, leave-one-side ablation does not follow the same ranking: protein-side removal yields a positive $\Delta$ (supportive), whereas drug-side removal yields a negative $\Delta$ (suppressive). In other words, the worm model can be locally most sensitive to drug coordinates while the coordinated top-$k$ mass on the drug side acts, on average, as a brake on the score; the protein side provides the net supportive offset in this ablation view.
#### 4\.2\.2GCN input
Figure 7: Explanations at the GCN input (pre-message-passing node tensor). Top: mean absolute attributions (IG, saliency, LRP, SmoothGrad, SmoothGrad-IG), averaged over stratified subsamples, for the three concatenated node groups: protein (node 0 embedding), drug (node 1 embedding), and learned bridge rows. Bottom: leave-one-side ablation on the concatenated nodes: mean normalized $\Delta = (y_{\mathrm{base}} - y_{\mathrm{abl}})/(|y_{\mathrm{base}}| + \varepsilon)$ when an entire side is removed. Positive $\Delta$ indicates that side is net supportive (removal lowers the score); negative $\Delta$ indicates net suppressive (removal raises the score).

The GCN input is the tensor formed by concatenating protein-side, drug-side, and learned bridge embeddings into a single node matrix before cosine-based adjacency construction and graph convolution. This layer is a natural checkpoint because it separates (i) how much gradient-based attributions target each source of features from (ii) how much coordinated removal of an entire modality changes the prediction.
We summarize the same five explanation methods used elsewhere (IG, saliency, LRP, SmoothGrad, SmoothGrad-IG), averaging mean $|\text{attribution}|$ over stratified subsamples for each node group. Across Gao, Human, and C.elegans, the learned bridge coordinates contribute only small mean magnitudes relative to protein and drug in every method shown: attributions concentrate on the two biological modalities, not on the bridge slot at this stage. Method-to-method reorderings of protein versus drug are modest and dataset-dependent (e.g., protein peaks highest under saliency/LRP/SmoothGrad on Gao, whereas IG and SmoothGrad-IG can slightly favor the drug group on some splits), but none of these variants elevate bridge nodes to a dominant sensitivity channel in Fig. [7](https://arxiv.org/html/2606.14245#S4.F7).
The leave-one-side panel provides a complementary directional readout that need not coincide with which side has larger $|\text{attribution}|$. On Gao and Human, removing the protein side yields positive normalized $\Delta$ (supportive), whereas removing the drug side yields negative $\Delta$ (suppressive), with Gao showing the larger magnitudes. On C.elegans, the pattern reverses: drug-side removal is strongly supportive (positive $\Delta$), while protein-side removal is strongly suppressive (negative $\Delta$). Thus, bridge novelty should not be inferred from large bridge attributions at the GCN input in this visualization; instead, the plots emphasize that (a) sensitivity is dominated by protein/drug embeddings, while (b) the net push/pull of whole modalities under hard ablation is benchmark-dependent and can invert between organism-scale corpora and the large screening-style collection.
### 4\.3Input layers
The following subsections summarize explainability at each input modality. To reduce dependence on any single attribution method, we combine gradient- and propagation-based scores with a method-agnostic occlusion baseline on the same features. Concretely, we compute one importance vector per feature from leave-one-feature (or position/channel) ablation with a fixed replacement rule (zero or batch-mean baseline, chosen consistently with the integrated-gradient baseline where applicable) and record a normalized prediction change. Separately, we obtain rankings from five explanation techniques: saliency, Integrated Gradients (IG), Layer-wise Relevance Propagation (LRP), SmoothGrad, and SmoothGrad-IG. For each modality and choice of $k$, we retain features in the intersection of the ablation top-$k$ set and the top-$k$ sets of all listed methods: a strict consensus that flags dimensions on which occlusion and every gradient-based method agree. Where noted, vectors are averaged over repeated stratified mini-batches before overlap evaluation to stabilize rankings under sampling noise.
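The strict intersection criterion can be sketched directly with sets. The six-feature importance vectors below are invented for illustration, and the tie-breaking rule (lower index first) is an assumption.

```python
def top_k(scores, k):
    """Indices of the k largest scores (ties broken by lower index)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return set(order[:k])

def strict_consensus(ablation, method_scores, k):
    """Features in the ablation top-k AND in the top-k of every attribution method."""
    keep = top_k(ablation, k)
    for scores in method_scores.values():
        keep &= top_k(scores, k)
    return sorted(keep)

# Hypothetical importance vectors over six features.
ablation = [0.9, 0.1, 0.7, 0.05, 0.6, 0.0]
methods = {
    "saliency":      [0.8, 0.2, 0.6, 0.1, 0.5, 0.0],
    "ig":            [0.7, 0.6, 0.9, 0.0, 0.1, 0.2],
    "lrp":           [0.9, 0.1, 0.8, 0.3, 0.2, 0.0],
    "smoothgrad":    [0.6, 0.0, 0.7, 0.1, 0.5, 0.2],
    "smoothgrad_ig": [0.9, 0.3, 0.8, 0.0, 0.2, 0.1],
}
consensus = strict_consensus(ablation, methods, k=3)
print(consensus)  # [0, 2]
```

Feature 4 is in the ablation top-3 but is dropped because IG, LRP, and SmoothGrad-IG do not rank it that high. With `k=1` the consensus is empty, which shows how quickly a strict intersection can collapse, as in the positional atom-feature rows of Table 4.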
The resulting per-modality consensus lists (and related summaries) are reported in Table [4](https://arxiv.org/html/2606.14245#S4.T4); the AtomFin block follows the same logical pipeline but summarizes importance at the fingerprint level and additionally maps selected dimensions to BRICS-style substructure strings, with an alternative aggregation when indicated (e.g., sum-max vs. weighted summaries).
This design does not establish biological ground truth; it trades single\-method optimism for a conservative agreement criterion and keeps protein\-side, drug\-side, and sequence\- versus graph\- versus fingerprint\-native indices in separate spaces rather than forcing a single cross\-modal ranking\.
Table 4: Cross-view agreement analysis across protein sequence ($k$-mers and amino acids) and drug representations (atomFin substructures and atom features) for the Gao, Human, and C.elegans datasets. For each modality, the table reports the highest- and lowest-consensus features based on explanation agreement scores across views. Top intersections indicate features consistently identified as highly important across multiple datasets. Both marginal and per-occurrence amino acid analyses are presented to distinguish overall feature importance from frequency-normalized importance patterns.

| Dataset | Top 10 consensus k-mers | Low 10 consensus k-mers |
| --- | --- | --- |
| Gao | LAH, RFS, SRV, TNG, SYG, YNY, LEC, ELN, DVV, VLL | KHH, WHA, MRF, RCH, GDM, NWH, TQW, EFR, IPN, DGM |
| Human | ESE, VL, VLV, FV, VSQ, VLR, SMQ, RI, EQL, RIS | HWV, WIF, WHR, GCW, ARC, WPP, RCE, GMF, NMH, CEY |
| C.elegans | RTF, SM, GIF, TSI, KRG, STL, LPL, GDK, FVK, MLD | PLC, YWL, NCR, NSW, WVD, CPW, PCE, WYT, QWV, FPR |
| Top intersections | AES, SGN, VL | |

| Dataset | Top 10 consensus amino acids | Low 10 consensus amino acids |
| --- | --- | --- |
| *Marginal analysis* | | |
| Gao | V, E, L, G, `<EOS>`, P, Q, A, S, K | `<UNK>`, W, M, Y, H, D, N, R, F, I |
| Human | L, R, S, A, V, I, E, K, G, N | `<UNK>`, `<EOS>`, W, M, Q, H, F, D, T, P |
| C.elegans | S, I, A, K, L, V, E, D, N, R | `<UNK>`, `<EOS>`, M, H, Q, F, T, P, Y, G |
| Top intersections | L, V, S | |
| *Per-occurrence analysis* | | |
| Gao per-occ | W, V, H, Q, E, P, F, I, G, A | `<UNK>`, `<EOS>`, S, L, D, R, K, Y, M, N |
| Human per-occ | W, H, N, M, R, I, Q, F, S, K | `<UNK>`, `<EOS>`, G, L, A, P, V, E, D, T |
| C.elegans per-occ | Y, H, M, Q, S, N, F, K, I, P | `<UNK>`, `<EOS>`, L, V, E, A, T, D, G, R |
| Top intersections | H, Q, F | |

| Dataset | Top 15 consensus atomFin substructures | Low 5 consensus atomFin substructures |
| --- | --- | --- |
| Gao-weighted | `C`, `O`, `c(c)c`, `c`, `N`, `c(c)(c)C`, `c(c)n`, `c(cc)(cc)C(F)(F)F`, `c(cc)cc`, `c(cc)c(c)c`, `n`, `c(c)(c)c`, `c(-c)(c)c`, `C(C)C`, `C(c)(F)(F)F` | `c(c(-c)c)c(c)C`, `n1ccsc1N`, `C(=O)(Cc)N(C)C`, `N(C)(Cc)C(C)=O`, `O(Cc)Cc` |
| Human-weighted | `C`, `O`, `N`, `c(c)c`, `C(C)C`, `OC`, `CC`, `c`, `n`, `O(C)C`, `C(C)(C)O`, `C(C)(C)C`, `O=C`, `c(c)(c)C`, `C(C)N` | `C(=O)(O)C(c)(C)C`, `C(CC)(CC)(c(c)c)c(c)c`, `s1c(S)ccc1S`, `c(N)(cc)c(c)Cl`, `c(cc)(cc)OP` |
| C.elegans-weighted | `O`, `C`, `c`, `O=C`, `N`, `C(C)C`, `OC`, `c(c)c`, `n`, `CC`, `C(C)(=O)O`, `n(c)c`, `C(C)O`, `C(C)(C)N`, `C(C)(C)O` | `C(CC)(C(C)=O)(c(c)c)c(c)c`, `c(nc)(c(n)N)c(n)n`, `C(=O)(CN)NC`, `C(C)(C(=O)O)C(C)O`, `O(C)C(C)O` |

| Dataset | Top 10 consensus atom features | Low 10 consensus atom features |
| --- | --- | --- |
| Gao pos | 46 | 123, 124, 125, 127 |
| Human pos | N/A | 127, 125, 126, 124 |
| C.elegans pos | 1, 5, 4, 2, 3, 8, 7, 6, 9 | 112, 117, 123, 124, 125, 126, 127, 94, 65, 86 |
| Gao | 71, 47, 46, 65, 56, 55, 69, 70, 0, 66 | 31, 32, 33, 35, 37, 38, 39, 41, 51, 54 |
| Human | 0, 65, 46, 56, 69, 57, 47, 58, 73 | 20, 24, 27, 31, 33, 39, 49, 50, 51, 52, 60 |
| C.elegans | 56, 55, 0, 70, 45, 47, 46, 2, 71, 57 | 27, 30, 31, 36, 39, 49, 52, 54, 60 |
#### 4\.3\.1aminoCtr \(protein k\-mer\) input
The protein branch represents each target as a fixed-length vector of dense descriptors (here, $k$-mer-style counts or compositions). We explain this layer directly at the raw input by applying the same suite of gradient-based methods used elsewhere (IG, saliency, LRP, SmoothGrad, SmoothGrad-IG), yielding one nonnegative importance score per vector dimension after the usual batch averaging and absolute-value summaries.
Independently of these gradients, we run feature\-wise occlusion ablation: one dimension at a time is replaced by a baseline value \(batch mean or zero\), chosen consistently with the integrated\-gradient baseline, and the change in the scalar prediction is recorded and normalized\. Absolute normalized effects define an ablation\-based ranking that is comparable across XAI methods on a common index set\.
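Feature-wise occlusion as described above can be sketched with a stand-in predictor. The linear `predict` function and the two-sample batch are hypothetical; only the replacement rule (batch mean or zero) and the normalized change follow the text.

```python
def occlusion_importance(predict, batch, baseline="mean", eps=1e-6):
    """Mean |normalized prediction change| when one feature is replaced by a baseline."""
    n_feat = len(batch[0])
    if baseline == "mean":
        base = [sum(x[j] for x in batch) / len(batch) for j in range(n_feat)]
    else:  # "zero"
        base = [0.0] * n_feat
    scores = []
    for j in range(n_feat):
        effects = []
        for x in batch:
            occluded = list(x)
            occluded[j] = base[j]
            y0, y1 = predict(x), predict(occluded)
            effects.append(abs((y0 - y1) / (abs(y0) + eps)))
        scores.append(sum(effects) / len(effects))
    return scores

def predict(x):
    """Hypothetical linear predictor standing in for the trained model."""
    return 1.0 + 2.0 * x[0] + 0.0 * x[1] - 0.5 * x[2]

batch = [[1.0, 3.0, 2.0], [0.0, 1.0, 4.0]]
scores = occlusion_importance(predict, batch, baseline="zero")
```

The second feature has zero weight and therefore zero occlusion importance. The resulting vector lives on the same index set as the gradient-based scores, so the two can be ranked and intersected directly.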
For robustness to minibatch noise, attribution vectors and ablation vectors are averaged over repeated stratified subsamples. We then form strict consensus sets at fixed cutoffs: a dimension is retained only if it lies in the intersection of the ablation top-$k$ set and the top-$k$ sets of all gradient-based methods. These consensus dimensions (Table [4](https://arxiv.org/html/2606.14245#S4.T4), $k$-mer block) are interpretable as $k$-mer tokens in the model's vocabulary rather than as raw column indices; numeric indices are omitted from the main table because $k$-mer orderings need not be aligned across benchmarks unless the same enumeration is explicitly shared.
##### Results and interpretation
Across Gao, Human, and C.elegans, high-consensus $k$-mers mix small and hydrophobic motifs (e.g. Human VL/VLV/FV, Gao VLL/DVV, C.elegans FVK/MLD) with polar or charged triplets (e.g. ESE, VSQ, KRG). Such mixtures are plausibly consistent with composition patches enriched in aliphatic content together with occasional polar/charged spacers, but they should not be read as evidence of a single "critical peptide" without external validation (mutagenesis, structure, or motif enrichment against negatives).
Low-consensus (bottom-ranked) $k$-mers more often contain bulky or aromatic-heavy letters (e.g. W, H, Y, M, F in Human lows such as HWV, WIF, WHR). Rare tokens can disagree strongly between gradient attributions and occlusion; a small intersection at the low tail therefore reflects estimator disagreement and low frequency more than a proof that those motifs are biologically irrelevant.
Dataset geometry matters for how these lists should be read: Gao's highly hubbed protein distribution means $k$-mer explanations aggregate many drug partners per recurrent target, which can wash organism-specific signal toward broadly recurrent composition patterns; Human and C.elegans may emphasize more corpus-typical strings (e.g. C.elegans RTF, SM, GIF), though the tokens remain model-dependent summaries rather than guaranteed binding motifs.
Where available, cross-view overlap between $k$-mer consensus and amino-acid-level consensus (Human: AES, SGN, VL in Table [4](https://arxiv.org/html/2606.14245#S4.T4)) supports partial alignment between coarse composition features and residue-level rankings on that corpus; absence of entries in other columns indicates that such cross-view agreement was not observed under the same strict intersection criterion.
#### 4\.3\.2AtomFin \(molecular fingerprint\) input
Drugs are represented as a fixed-length fingerprint vector. We attribute this layer with the same gradient-based panel (IG, saliency, LRP, SmoothGrad, SmoothGrad-IG) and complement it with one-dimensional occlusion: each fingerprint dimension is replaced by a baseline (batch mean or zero, consistent with the IG baseline choice), and the induced change in the prediction is normalized using the same utilities as for aminoCtr (relative scaling to the baseline score, optional $z$-style scaling, and clipping when enabled). After stratified multi-run averaging, dimensions are ranked by mean absolute ablation effect and by each method; strict consensus at cutoff $k$ retains only indices in the intersection of the ablation top-$k$ and the top-$k$ sets of all gradient methods.
##### Mapping bits to chemistry
Raw molecular fingerprint dimensions are not chemically legible on their own. Following the non-aggregated attribution trace, active bits on held-out molecules are mapped to localized substructures with RDKit (SMARTS / fragment SMILES via bit information), producing per-occurrence records that are then aggregated for reporting. We summarize substructure importance with a molecule-aware reduction that limits repetition bias: for each fragment SMILES, we take the maximum attribution score within each molecule and then aggregate across molecules. Table [4](https://arxiv.org/html/2606.14245#S4.T4) reports the weighted ranking (sum of per-molecule maxima, reweighted by molecule support), which slightly favors fragments that recur across many compounds; the qualitative families agree with an unweighted sum-max ranking in our runs, so only one column is shown in the main table.
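The molecule-aware reduction can be sketched on a handful of per-occurrence records. The record layout, the example scores, and in particular the support weighting (sum-max multiplied by the fraction of molecules containing the fragment) are assumptions; the text does not specify the exact reweighting formula.

```python
from collections import defaultdict

def fragment_ranking(records, n_molecules):
    """records: (molecule_id, fragment_smiles, attribution) per active-bit occurrence."""
    per_mol_max = defaultdict(dict)  # fragment -> {molecule_id: max attribution}
    for mol_id, frag, score in records:
        best = per_mol_max[frag].get(mol_id)
        if best is None or score > best:
            per_mol_max[frag][mol_id] = score
    table = {}
    for frag, by_mol in per_mol_max.items():
        sum_max = sum(by_mol.values())
        support = len(by_mol) / n_molecules  # assumed weighting: fraction of molecules
        table[frag] = {"sum_max": sum_max, "weighted": sum_max * support}
    return table

# Hypothetical occurrences: the aromatic fragment repeats within molecule m1.
records = [
    ("m1", "c(c)c", 0.2), ("m1", "c(c)c", 0.5), ("m1", "c(c)c", 0.4),
    ("m2", "c(c)c", 0.3),
    ("m1", "C(F)(F)F", 0.9),
]
table = fragment_ranking(records, n_molecules=2)
```

Taking the per-molecule maximum keeps the three aromatic occurrences in `m1` from being counted three times. Under the assumed weighting, the fragment seen in both molecules overtakes the rarer trifluoromethyl fragment even though the latter has the higher unweighted sum-max.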
##### Results
Consensus fragments stratify according to the chemical statistics of each corpus. On Gao, beyond ubiquitous one- and two-atom motifs (C, O, c), the strongest recurring substructures are aromatic-rich: fluorinated aryl units (C(F)(F)F-containing patterns), extended substituted benzenoids, aza-aromatic linkages (c(c)n), and aryl-ether / benzylic connectors (O(C)c…). This aligns with a screening-heavy library where fused/heteroaromatic scaffolds and aromatics substituted with electron-withdrawing groups (EWG) are common.
On Human and C.elegans, top fragments emphasize simpler aliphatic skeletons and oxygen- and nitrogen-containing functionality. Branched alkyls (C(C)C, C(C)(C)C), alcohol/ether/carbonyl motifs (OC, O=C, O(C)C), and small heteroaromatic prefixes dominate. C.elegans shows a stronger tilt toward carboxylic/acid-derivative patterns (C(C)(=O)O), consistent with more polar, metabolite-like chemistry in that benchmark.
Low-consensus fragments (bottom of the weighted lists) are longer, more specific SMARTS that appear rarely; low rank here usually means weaker agreement or weaker aggregate contribution under the chosen aggregation, not a definitive claim that the substructure is "anti-binding" in a biochemical sense.
##### Caveats
Dominance of very small fragments in the top rows is expected when common subgraphs appear in almost every molecule: they function as baseline "currency" of the fingerprint map. For finer chemical readouts, supplementary analyses can filter trivial singletons/doubletons or impose minimum heavy-atom or ring counts before ranking; the full per-bit JSONL trace remains the authoritative record for molecule-level audits.
#### 4\.3\.3aminoSeq \(protein sequence\) input
The protein sequence enters the model as a discrete token matrix (amino-acid identifiers plus special symbols). We explain this modality in two aligned geometries: per-position importance along the sequence and, when the analysis is anchored at the embedding stage, per-residue-type importance aggregated over occurrences.
Gradient\-based attributions \(IG with a chosen baseline, saliency, LRP when enabled, SmoothGrad, and SmoothGrad\-IG\) yield comparable score vectors over positions or over the 20 standard amino\-acid categories \(plus specials\)\. These are paired with occlusion\-style ablation at the embedding: selected positions or all occurrences of a residue type are replaced by a baseline embedding slice \(zero or batch\-averaged, depending on configuration\), and the normalized change in the predicted score is recorded\.
As elsewhere, vectors are averaged over repeated stratified mini-batches. Strict consensus at cutoff $k$ retains tokens that lie in the intersection of the ablation top-$k$ ranking (by absolute normalized effect) and the top-$k$ sets from every gradient-based method, so reported highlights require simultaneous agreement between perturbation and gradients.
##### Marginal vs\. per\-occurrence amino\-acid summaries
Table [4](https://arxiv.org/html/2606.14245#S4.T4) reports amino-acid consensus in two normalizations. Marginal rankings correlate strongly with proteome-wide abundance: Leu, Ala, Ser, Gly, Val, and acidic or amide residues appear frequently in the high-consensus rows across corpora. The same block also lists special symbols (`<EOS>`) for Gao and C.elegans; these should be interpreted cautiously as pipeline signals (padding, sequence termination, or truncation) unless analyses are explicitly masked to valid sequence length. Low-consensus (bottom) sets concentrate on rare residues (e.g. Trp, Met) and on `<UNK>`, which often flags tokenization or vocabulary edge cases rather than a clean biochemical role.
The per\-occurrence block reweights each residue type by how often it appears before averaging effects; under this normalization, aromatic and polar side chains \(e\.g\. Trp, His, Gln/Asn, Phe/Tyr\-family behavior\) rise in Human and Gao, a pattern more compatible with interface\-enriched chemistry than with bulk composition alone\. C\.elegans shows a related shift \(e\.g\. Tyr, His, Met, Gln in the high\-consensus per\-occurrence row\)\.
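One plausible reading of the two normalizations is that the marginal score sums absolute attribution over every occurrence of a residue type, while the per-occurrence score divides that sum by the occurrence count. The sketch below follows that reading on invented sequences and scores; the authors' exact reweighting may differ.

```python
from collections import Counter

def residue_importance(sequences, position_scores):
    """Aggregate per-position |attribution| into marginal and per-occurrence residue scores."""
    total = Counter()
    count = Counter()
    for seq, scores in zip(sequences, position_scores):
        for residue, s in zip(seq, scores):
            total[residue] += abs(s)
            count[residue] += 1
    marginal = dict(total)
    per_occurrence = {r: total[r] / count[r] for r in total}
    return marginal, per_occurrence

# Hypothetical sequences: Leu is abundant with small scores, Trp is rare with large ones.
seqs = ["LLLLW", "LLAW"]
scores = [[0.1, 0.1, 0.1, 0.1, 0.25], [0.1, 0.1, 0.05, 0.25]]
marginal, per_occurrence = residue_importance(seqs, scores)

top_marginal = max(marginal, key=marginal.get)
top_per_occurrence = max(per_occurrence, key=per_occurrence.get)
```

Under the marginal view the abundant residue leads simply because it occurs often; dividing by occurrence count moves the rare aromatic residue to the top, mirroring the shift between the two blocks of Table 4.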
##### Cross\-view consistency \(Human\)
On Human, marginal amino-acid consensus intersects the strict $k$-mer consensus (AES, SGN, VL), and the marginal residue intersection column lists L, V, S, overlapping aliphatic/polar characters with the $k$-mer table. Per-occurrence intersections (H, Q, F) further emphasize polar/aromatic involvement. These overlaps support partial agreement between coarse $k$-mer features and sequence-token explanations on that benchmark, without implying structural ground truth.
#### 4\.3\.4AtomFea \(drug graph atom tensor\) input
Each drug is encoded as a padded tensor of shape (batch $\times$ atoms $\times$ channels), where channels concatenate RDKit-style local atomic descriptors. We explain this layer along two non-redundant axes that match the modeling tensor: which atom slots matter (positional importance) and which descriptor channels matter when contributions are aggregated over all heavy-atom positions (feature-channel importance).
Gradient-based attributions (IG, saliency, LRP, SmoothGrad, SmoothGrad-IG) are reduced to either a length-$N_{\mathrm{atoms}}$ vector (mean $|\cdot|$ over batch and channel) or a length-$F$ vector (mean $|\cdot|$ over batch and atom), depending on whether we sum sensitivity across features or across positions. Occlusion ablation is matched to the same geometry: either an entire atom column is replaced by a baseline slice, or a single feature channel is replaced across all atoms, and the normalized prediction change is recorded. After repeated stratified subsampling, strict consensus at cutoff $k$ keeps only indices that lie in the intersection of the ablation top-$k$ set and the top-$k$ sets of all gradient methods; Table [4](https://arxiv.org/html/2606.14245#S4.T4) reports both positional and channel consensus separately.
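The two reductions of the attribution tensor can be written out explicitly. The tiny tensor below (2 molecules, 3 atom slots, 2 channels) and the function name are illustrative assumptions; the real tensor has 128 atom slots and 75 channels.

```python
def reduce_attributions(attr):
    """attr[b][a][c]: attribution for batch item b, atom slot a, channel c."""
    n_batch, n_atoms, n_chan = len(attr), len(attr[0]), len(attr[0][0])
    # Positional importance: mean |.| over batch and channel for each atom slot.
    per_atom = [
        sum(abs(attr[b][a][c]) for b in range(n_batch) for c in range(n_chan))
        / (n_batch * n_chan)
        for a in range(n_atoms)
    ]
    # Channel importance: mean |.| over batch and atom for each descriptor channel.
    per_channel = [
        sum(abs(attr[b][a][c]) for b in range(n_batch) for a in range(n_atoms))
        / (n_batch * n_atoms)
        for c in range(n_chan)
    ]
    return per_atom, per_channel

attr = [
    [[0.2, -0.4], [0.0, 0.0], [0.0, 0.0]],   # one real atom followed by padding
    [[-0.6, 0.2], [0.4, -0.2], [0.0, 0.0]],  # two real atoms followed by padding
]
per_atom, per_channel = reduce_attributions(attr)
```

The last atom slot is padding in both molecules and receives zero positional importance, which illustrates why trailing indices tend to collect in the low-consensus positional rows.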
##### Channel interpretation
The channel ordering follows a fixed featurization \(one\-hot atom identity; one\-hot degree; one\-hot implicit valence; scalar formal charge; scalar radical count; one\-hot hybridization state; aromatic boolean; one\-hot total hydrogen count\):
1. indices 0\-43: atom\-type one\-hot;
2. 44\-54: degree one\-hot \(0\-10\);
3. 55\-61: implicit\-valence one\-hot \(0\-6\);
4. 62: formal charge \(integer\);
5. 63: radical\-electron count \(integer\);
6. 64\-68: hybridization one\-hot \(SP, SP2, SP3, SP3D, SP3D2\);
7. 69: aromatic flag;
8. 70\-74: total\-hydrogen one\-hot \(0\-4\)\.
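When reading a consensus list of channel indices, it can help to map each index back to its descriptor block\. The lookup below is a small sketch built only from the index ranges listed above; the block names and helper functions are hypothetical labels chosen for illustration\.

```python
from collections import Counter

# (block name, first index, last index), inclusive, following the listing above.
CHANNEL_BLOCKS = [
    ("atom_type", 0, 43),
    ("degree", 44, 54),
    ("implicit_valence", 55, 61),
    ("formal_charge", 62, 62),
    ("radical_electrons", 63, 63),
    ("hybridization", 64, 68),
    ("aromatic", 69, 69),
    ("total_hydrogens", 70, 74),
]

def channel_block(index):
    """Return (block name, offset within the block) for a channel index."""
    for name, first, last in CHANNEL_BLOCKS:
        if first <= index <= last:
            return name, index - first
    raise IndexError(f"channel index {index} is outside 0-74")

def summarize_by_block(channel_indices):
    """Count how many consensus channels fall into each descriptor block."""
    return Counter(channel_block(i)[0] for i in channel_indices)

n_channels = sum(last - first + 1 for _, first, last in CHANNEL_BLOCKS)
```

For example, `channel_block(57)` returns `("implicit_valence", 2)`, i\.e\. the one\-hot slot for an implicit valence of 2 under the ordering above\.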
In the reported consensus lists, high\-ranked channels concentrate in the valence, hybridization, hydrogen, and aromatic blocks rather than in rare heteroatom one\-hots\. This suggests that the model’s cross\-graph summary is more sensitive to coarse electronic structure and hydrogen\-count features, which correlate with conjugation and local geometry, than to exotic element identities at this resolution\.
##### Positional consensus and padding
Positional consensus is fragile relative to channels: on Gao the strict top\-$k$ position list collapses to a single index in the reported table, whereas Human shows no top\-$k$ positional intersection while several corpora agree on low\-importance slots near indices 123\-127\. With fixed\-length padding, those high indices are plausibly dominated by padding or near\-padding atoms; interpret positional highs/lows only alongside valid\-length masks or a length\-stratified sensitivity check\. C\. elegans instead shows early\-index consensus \(1\-9\), which is consistent with shorter effective molecular occupancy in that benchmark \(early atoms carry more signal before the padded tail\), but the positional block should still be treated as secondary to the channel block for qualitative claims\.
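One way to apply the valid\-length check suggested above is to compute, for each atom slot, the fraction of molecules whose real atoms reach that slot, and to flag consensus positions that are mostly padding\. The sketch below assumes 0\-based slot indices, a padded length of 128, and an arbitrary occupancy threshold; none of these values is stated in the text, and the molecule lengths are invented for illustration\.

```python
import numpy as np

def slot_occupancy(valid_lengths, n_slots):
    """Fraction of molecules whose real (non-padding) atoms reach each slot."""
    lengths = np.asarray(valid_lengths)
    slots = np.arange(n_slots)
    return (slots[None, :] < lengths[:, None]).mean(axis=0)

def flag_padding_dominated(indices, valid_lengths, n_slots, min_occupancy=0.05):
    """Positions from `indices` occupied by fewer than `min_occupancy` of molecules."""
    occupancy = slot_occupancy(valid_lengths, n_slots)
    return [i for i in indices if occupancy[i] < min_occupancy]

# Hypothetical heavy-atom counts for four molecules padded to 128 slots.
lengths = [20, 30, 128, 25]
occupancy = slot_occupancy(lengths, n_slots=128)
suspect = flag_padding_dominated([3, 125], lengths, n_slots=128, min_occupancy=0.5)
```

Here slot 125 is flagged because only one of the four molecules has a real atom there, whereas slot 3 is occupied in every molecule\.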
##### Caveats
Channel consensus aggregates contributions across the entire graph tensor; it does not, by itself, localize a pharmacophore to a specific atom index\. Conversely, positional consensus can be confounded by convolution/pooling and padding; use the channel view for "what chemistry the model measures" and the positional view for "where that chemistry tends to occur," subject to the padding caveat above\.
## 5 Discussion
### 5\.1 Is explainability for DTI models scientifically useful?
Explainable AI \(XAI\) for drug\-target interaction \(DTI\) and drug\-target affinity \(DTA\) prediction answers a practical question \(which inputs move this predictor\) rather than automatically answering a mechanistic one \(what biochemical process occurs at an interface\)\. Attribution methods \(e\.g\., integrated gradients, saliency, LRP, SmoothGrad variants\) summarize internal sensitivity patterns; they do not, by themselves, certify biological mechanism\. Accordingly, disagreement between attributions and experimentally mapped binding sites can indicate spurious correlations, representation bias, or optimization of proxy statistics under a given split design; it can also reflect genuine model behavior that is statistically predictive but not structurally supervised\.
This limitation does not render XAI pointless for DTI modeling\. Used conservatively, XAI turns "black\-box accuracy" into auditable behavior: it exposes reliance on special tokens and padding, highlights modality dominance \(protein vs\. drug vs\. bridge\), and separates magnitude sensitivity from signed effects under coordinated ablation\. These are failure modes that are difficult to diagnose from accuracy alone\. In that sense, XAI is most viable when framed as model criticism and hypothesis generation \(what to mutate, which bits to probe, which splits stress\-test "shortcut" features\), with external evidence \(structures, assays, enrichment against negatives\) required for mechanistic claims\. Ultimately, our framework may help highlight molecular fragments whose influence on model predictions merits further structural or experimental evaluation during lead optimization\.
### 5\.2 Relation to graph\-specific explainers
Graph explainer families \(e\.g\., GNNExplainer, PGExplainer\) target discrete graph structure when the graph is the primary learned interface\. In Bridge\-DTI\-style architectures, predictive signal is distributed across raw encoders, a similarity\-induced adjacency, and a shallow GCN; explainers that assume a single critical sparse subgraph may be a poor match when the model's behavior is dominated by upstream fingerprints, $k$\-mers, or dense rectified cosine coupling\. For our setting, differentiable adjacency sensitivities and targeted edge ablations at the GCN input provide a more direct alignment with how edges enter the forward pass than post\-hoc subgraph masks trained on a different inductive bias\. That said, for models where message passing is the first and dominant computation, learned\-edge explainers remain a natural tool\.
### 5\.3 Generalization, split design, and what XAI can test
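To make the idea of a targeted edge ablation at the GCN input concrete, the toy sketch below builds a rectified cosine adjacency over node embeddings, applies one row\-normalized graph convolution with a mean readout, and records how the readout changes when a single edge is removed\. It is not the BridgeDPI implementation: the adjacency form, the single layer, the readout, and all function names are simplifying assumptions made for illustration\.

```python
import numpy as np

def rectified_cosine_adjacency(h):
    """A[i, j] = max(0, cos(h_i, h_j)); an assumed form of the similarity adjacency."""
    unit = h / (np.linalg.norm(h, axis=1, keepdims=True) + 1e-12)
    return np.maximum(unit @ unit.T, 0.0)

def gcn_readout(a, h, w):
    """One row-normalized graph convolution with ReLU, then a mean readout."""
    a_norm = a / (a.sum(axis=1, keepdims=True) + 1e-12)
    return float(np.maximum(a_norm @ h @ w, 0.0).mean())

def edge_ablation_effects(h, w):
    """Change in the readout when each existing undirected edge is zeroed."""
    a = rectified_cosine_adjacency(h)
    base = gcn_readout(a, h, w)
    effects = {}
    n = a.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            if a[i, j] > 0:
                a_cut = a.copy()
                a_cut[i, j] = a_cut[j, i] = 0.0
                effects[(i, j)] = base - gcn_readout(a_cut, h, w)
    return effects

# Three toy nodes; nodes 0 and 2 are orthogonal, so no edge joins them.
h = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]])
w = np.eye(2)
effects = edge_ablation_effects(h, w)
```

Because the ablation acts on the same adjacency that enters the forward pass, the recorded effects describe the model's own computation rather than a separately trained mask\.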
Strong validation metrics under high entity reuse \(warm proteins or warm drugs\) can co\-exist with explanations that emphasize repeated corpus motifs\. This is especially relevant when protein hubs concentrate many chemotypes \(as in large screening\-style benchmarks\) or when evaluation emphasizes cold drugs versus cold targets depending on the benchmark\. XAI is most informative when paired with split protocols that match the deployment question: cold\-drug evaluation stresses pharmacophoric novelty, whereas cold\-target evaluation stresses target\-side novelty\. Under cold settings, agreement \(or systematic disagreement\) between gradient attributions and occlusion provides a concrete checklist of potential shortcuts to test next\.
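A cold split can be sketched by holding out whole entities rather than individual pairs, so that no held\-out drug \(or target\) appears in training\. The function below is a minimal illustration; the tuple layout, the function name, and the pair list are hypothetical, and real benchmarks may define their cold splits differently\.

```python
import random

def cold_split(pairs, entity="drug", test_fraction=0.2, seed=0):
    """Split (drug_id, target_id, label) tuples so held-out entities never appear in train."""
    if entity not in ("drug", "target"):
        raise ValueError("entity must be 'drug' or 'target'")
    col = 0 if entity == "drug" else 1
    entities = sorted({p[col] for p in pairs})
    random.Random(seed).shuffle(entities)
    n_test = max(1, int(round(test_fraction * len(entities))))
    held_out = set(entities[:n_test])
    train = [p for p in pairs if p[col] not in held_out]
    test = [p for p in pairs if p[col] in held_out]
    return train, test

# Hypothetical interaction pairs.
pairs = [
    ("d1", "t1", 1), ("d1", "t2", 0), ("d2", "t1", 1),
    ("d3", "t2", 0), ("d4", "t3", 1), ("d5", "t1", 0),
]
train, test = cold_split(pairs, entity="drug", test_fraction=0.4, seed=0)
```

Running the same attribution and occlusion analyses on the held\-out set then tests whether the features a model relies on survive when the entity itself is unseen\.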
### 5\.4 Representations and richer chemistry
Explanations inherit the ontology of the features\. Fingerprints, $k$\-mers, and 2D graph featurizations surface pharmacophore\-like fragments and compositional motifs; they rarely localize a binding pose without additional structural context\. Incorporating explicit 2D/3D structure, protein pockets, or co\-complex priors can improve both model faithfulness and the interpretability of "where" questions, but it also shifts the evidentiary bar: explanations should then be evaluated against structural ground truth, not only against margin rankings on a benchmark\.
### 5\.5 From single\-feature ablation to grouped interventions
Our analyses emphasize single\-dimension occlusion at raw inputs and coordinated top\-$k$ removals at selected layers\. While informative, single\-feature ablation can under\-represent cooperative effects \(contiguous epitopes, fused ring systems, salt bridges\)\. Future work could explore group ablations: contiguous sequence segments, chemically connected subgraphs, pharmacophore clusters, or residue sets suggested by external site definitions\. Such interventions better match biological units of recognition and can be used to test whether the model's sensitivity aligns with pockets and motifs rather than isolated dimensions\.
### 5\.6 Conclusion and future work
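The gap between single\-feature and grouped ablation can be illustrated with a toy predictor in which two features are redundant: removing either one alone changes nothing, while removing both lowers the score\. The sketch below is an assumption\-laden illustration \(zero baseline, hypothetical function names, a hand\-written stand\-in predictor\), not an analysis performed in this study\.

```python
import numpy as np

def ablation_effect(predict, x, indices, baseline=0.0):
    """Drop in prediction when the listed input dimensions are set to the baseline."""
    x = np.asarray(x, dtype=float)
    x_abl = x.copy()
    x_abl[list(indices)] = baseline
    return float(predict(x) - predict(x_abl))

def interaction_gap(predict, x, group, baseline=0.0):
    """Joint ablation effect minus the sum of single-feature effects."""
    joint = ablation_effect(predict, x, group, baseline)
    singles = sum(ablation_effect(predict, x, [i], baseline) for i in group)
    return joint - singles

# Toy stand-in: features 0 and 1 are redundant, feature 2 acts independently.
toy_predict = lambda v: max(v[0], v[1]) + v[2]
x = [1.0, 1.0, 1.0]

gap_redundant = interaction_gap(toy_predict, x, [0, 1])
gap_independent = interaction_gap(toy_predict, x, [0, 2])
```

A non\-zero gap for the group \(0, 1\) indicates an effect that single\-dimension occlusion misses, whereas the zero gap for \(0, 2\) shows the two views agreeing when features act independently\.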
We presented a multi\-pronged explanation protocol to audit a Bridge\-DTI\-style predictor beyond headline metrics: gradient attributions, occlusion ablation, strict intersection consensus across methods, and layer\-wise comparisons \(raw inputs, bridge similarity summaries, GCN input/output, edge\-level sensitivities\)\.
Future directions include: \(i\) quantitative alignment scores between attributions and experimental binding\-site annotations or curated complex structures; \(ii\) grouped ablations and motif\-level hypothesis tests; \(iii\) cold\-split evaluations matched to the intended deployment regime; and \(iv\) integration of structural channels so that "where" and "why" claims can be checked against structural evidence\.
Beyond model auditing, these analyses may have practical utility in computational drug discovery workflows\. Attribution consistency and intervention\-based validation can help assess whether a predictor relies on chemically or biologically plausible patterns rather than dataset\-specific shortcuts\. Such analyses may assist hypothesis generation, guide downstream experimental prioritization, and improve confidence in model behavior under cold\-drug or cold\-target settings\. Importantly, these explanations do not establish biochemical mechanism or causal binding interactions on their own; instead, they provide a diagnostic layer that can help identify which representations, fragments, or sequence regions warrant further structural or experimental investigation\.
We encourage the DTI/DTA community to treat interpretability analyses as standard reporting for new architectures: not as a substitute for biochemistry, but as a routine stress test that clarifies what a model implements, what it ignores, and where it is likely to fail under distribution shift\.
## References
- \[1\]S\. Bach, A\. Binder, G\. Montavon, F\. Klauschen, K\. Müller, and W\. Samek\(2015\)On pixel\-wise explanations for non\-linear classifier decisions by layer\-wise relevance propagation\.PloS one10\(7\),pp\. e0130140\.Cited by:[§3\.1](https://arxiv.org/html/2606.14245#S3.SS1.p5.1)\.
- \[2\]G\. Deng, C\. Shi, R\. Ge, R\. Hu, C\. Wang, F\. Qin, C\. Pan, H\. Mao, and Q\. Yang\(2025\)Efficient substructure feature encoding based on graph neural network blocks for drug\-target interaction prediction\.Frontiers in Pharmacology16,pp\. 1553743\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1)\.
- \[3\]K\. Y\. Gao, A\. Fokoue, H\. Luo, A\. Iyengar, S\. Dey, P\. Zhang,et al\.\(2018\)Interpretable drug target prediction using deep neural representation\.\.InIJCAI,Vol\.2018,pp\. 3371–3377\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1),[§3\.2](https://arxiv.org/html/2606.14245#S3.SS2.p2.7)\.
- \[4\]M\. K\. Gilson, T\. Liu, M\. Baitaluk, G\. Nicola, L\. Hwang, and J\. Chong\(2016\)BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology\.Nucleic acids research44\(D1\),pp\. D1045–D1053\.Cited by:[§3\.2](https://arxiv.org/html/2606.14245#S3.SS2.p2.7)\.
- \[5\]M\. Gim, J\. Choe, S\. Baek, J\. Park, C\. Lee, M\. Ju, S\. Lee, and J\. Kang\(2023\)ArkDTA: attention regularization guided by non\-covalent interactions for explainable drug–target binding affinity prediction\.Bioinformatics39\(Supplement\_1\),pp\. i448–i457\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[6\]X\. Gong, Q\. Liu, J\. He, Y\. Guo, and G\. Wang\(2025\)Multigrandti: an explainable multi\-granularity representation framework for drug\-target interaction prediction\.Applied Intelligence55\(2\),pp\. 107\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[7\]M\. Karimi, D\. Wu, Z\. Wang, and Y\. Shen\(2019\)DeepAffinity: interpretable deep learning of compound–protein affinity through unified recurrent and convolutional neural networks\.Bioinformatics35\(18\),pp\. 3329–3338\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[8\]A\. Khodabandeh Yalabadi, M\. Yazdani\-Jahromi, N\. Yousefi, A\. Tayebi, S\. Abdidizaji, and O\. O\. Garibay\(2024\)Fragxsitedti: revealing responsible segments in drug\-target interaction with transformer\-driven interpretation\.InInternational Conference on Research in Computational Molecular Biology,pp\. 68–85\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[9\]R\. R\. Kotkondawar, S\. R\. Sutar, A\. W\. Kiwelekar, V\. J\. Kadam, and S\. M\. Jadhav\(2025\)A generative framework for enhancing drug target interaction prediction in drug discovery\.Scientific Reports15\(1\),pp\. 35588\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p3.1)\.
- \[10\]H\. Kurata and S\. Tsukiyama\(2022\)ICAN: interpretable cross\-attention network for identifying drug and target protein interactions\.Plos one17\(10\),pp\. e0276609\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[11\]C\. Li, J\. Mi, H\. Wang, Z\. Liu, J\. Gao, and J\. Wan\(2025\)MGMA\-dti: drug target interaction prediction using multi\-order gated convolution and multi\-attention fusion\.Computational Biology and Chemistry118,pp\. 108449\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[12\]S\. Li, F\. Wan, H\. Shu, T\. Jiang, D\. Zhao, and J\. Zeng\(2020\)MONN: a multi\-objective neural network for predicting compound\-protein interactions and affinities\.Cell systems10\(4\),pp\. 308–322\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[13\]J\. Liao, H\. Chen, L\. Wei, and L\. Wei\(2022\)GSAML\-dta: an interpretable drug\-target binding affinity prediction model based on graph neural networks with self\-attention mechanism and mutual information\.Computers in biology and medicine150,pp\. 106145\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1)\.
- \[14\]B\. Liu, S\. Wu, J\. Wang, X\. Deng, and A\. Zhou\(2024\)Higraphdti: hierarchical graph representation learning for drug\-target interaction prediction\.InJoint European Conference on Machine Learning and Knowledge Discovery in Databases,pp\. 354–370\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[15\]H\. Liu, J\. Sun, J\. Guan, J\. Zheng, and S\. Zhou\(2015\)Improving compound–protein interaction prediction by building up highly credible negative samples\.Bioinformatics31\(12\),pp\. i221–i229\.Cited by:[§3\.2](https://arxiv.org/html/2606.14245#S3.SS2.p3.1)\.
- \[16\]S\. M\. Lundberg and S\. Lee\(2017\)A unified approach to interpreting model predictions\.Advances in neural information processing systems30\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p3.1)\.
- \[17\]N\. R\. Monteiro, C\. J\. Simões, H\. V\. Ávila, M\. Abbasi, J\. L\. Oliveira, and J\. P\. Arrais\(2022\)Explainable deep drug–target representations for binding affinity prediction\.BMC bioinformatics23\(1\),pp\. 237\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1)\.
- \[18\]T\. M\. Nguyen, T\. P\. Quinn, T\. Nguyen, and T\. Tran\(2021\)Counterfactual explanation with multi\-agent reinforcement learning for drug target prediction\.arXiv preprint arXiv:2103\.12983\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p4.1)\.
- \[19\]X\. Ru, Q\. Zou, and C\. Lin\(2023\)Optimization of drug–target affinity prediction methods through feature processing schemes\.Bioinformatics39\(11\),pp\. btad615\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p3.1)\.
- \[20\]G\. Schwalbe and B\. Finzel\(2024\)A comprehensive taxonomy for explainable artificial intelligence: a systematic survey of surveys on methods and concepts\.Data Mining and Knowledge Discovery38\(5\),pp\. 3043–3101\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p3.1)\.
- \[21\]R\. R\. Selvaraju, M\. Cogswell, A\. Das, R\. Vedantam, D\. Parikh, and D\. Batra\(2017\)Grad\-cam: visual explanations from deep networks via gradient\-based localization\.InProceedings of the IEEE international conference on computer vision,pp\. 618–626\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1)\.
- \[22\]Z\. Shi, Y\. Wang, P\. Weerawarna, J\. Zhang, T\. Richardson, Y\. Wang, and K\. Huang\(2025\)Structure\-aware compound\-protein affinity prediction via graph neural network with group lasso regularization\.arXiv preprint arXiv:2507\.03318\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1)\.
- \[23\]A\. Shrikumar, P\. Greenside, and A\. Kundaje\(2017\)Learning important features through propagating activation differences\.InInternational conference on machine learning,pp\. 3145–3153\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1)\.
- \[24\]K\. Simonyan, A\. Vedaldi, and A\. Zisserman\(2013\)Deep inside convolutional networks: visualising image classification models and saliency maps\.arXiv preprint arXiv:1312\.6034\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1),[§3\.1](https://arxiv.org/html/2606.14245#S3.SS1.p2.1)\.
- \[25\]D\. Smilkov, N\. Thorat, B\. Kim, F\. Viégas, and M\. Wattenberg\(2017\)Smoothgrad: removing noise by adding noise\.arXiv preprint arXiv:1706\.03825\.Cited by:[§3\.1](https://arxiv.org/html/2606.14245#S3.SS1.p4.1)\.
- \[26\]Y\. Sun, Y\. Y\. Li, C\. K\. Leung, and P\. Hu\(2024\)INGNN\-dti: prediction of drug–target interaction with interpretable nested graph neural network and pretrained molecule models\.Bioinformatics40\(3\),pp\. btae135\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[27\]M\. Sundararajan, A\. Taly, and Q\. Yan\(2017\)Axiomatic attribution for deep networks\.InInternational conference on machine learning,pp\. 3319–3328\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1),[§3\.1](https://arxiv.org/html/2606.14245#S3.SS1.p3.1)\.
- \[28\]M\. Tsubaki, K\. Tomii, and J\. Sese\(2019\)Compound–protein interaction prediction with end\-to\-end learning of neural networks for graphs and sequences\.Bioinformatics35\(2\),pp\. 309–318\.Cited by:[§3\.2](https://arxiv.org/html/2606.14245#S3.SS2.p3.1)\.
- \[29\]A\. Vefghi, Z\. Rahmati, and M\. Akbari\(2025\)Drug\-target interaction/affinity prediction: deep learning models and advances review\.Computers in Biology and Medicine196,pp\. 110438\.Cited by:[§1](https://arxiv.org/html/2606.14245#S1.p3.1)\.
- \[30\]M\. Wang, W\. Li, X\. Yu, Y\. Luo, K\. Han, C\. Wang, and Q\. Jin\(2023\)AffinityVAE: a multi\-objective model for protein\-ligand affinity prediction and drug design\.Computational Biology and Chemistry107,pp\. 107971\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[31\]G\. P\. Wellawatte, A\. Seshadri, and A\. D\. White\(2022\)Model agnostic generation of counterfactual explanations for molecules\.Chemical science13\(13\),pp\. 3697–3705\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p3.1)\.
- \[32\]Y\. Wu, M\. Gao, M\. Zeng, J\. Zhang, and M\. Li\(2022\)BridgeDPI: a novel graph neural network for predicting drug–protein interactions\.Bioinformatics38\(9\),pp\. 2571–2578\.Cited by:[§2\.2](https://arxiv.org/html/2606.14245#S2.SS2.p1.1)\.
- \[33\]Z\. Yang, W\. Zhong, L\. Zhao, and C\. Y\. Chen\(2021\)ML\-dti: mutual learning mechanism for interpretable drug–target interaction prediction\.The Journal of Physical Chemistry Letters12\(17\),pp\. 4247–4261\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[34\]Z\. Yang, W\. Zhong, L\. Zhao, and C\. Y\. Chen\(2022\)MGraphDTA: deep multiscale graph neural network for explainable drug–target binding affinity prediction\.Chemical science13\(3\),pp\. 816–833\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1)\.
- \[35\]M\. Yazdani\-Jahromi, N\. Yousefi, A\. Tayebi, E\. Kolanthai, C\. J\. Neal, S\. Seal, and O\. O\. Garibay\(2022\)AttentionSiteDTI: an interpretable graph\-based model for drug\-target interaction prediction using nlp sentence\-level relation classification\.Briefings in Bioinformatics23\(4\),pp\. bbac272\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[36\]Z\. Ying, D\. Bourgeois, J\. You, M\. Zitnik, and J\. Leskovec\(2019\)Gnnexplainer: generating explanations for graph neural networks\.Advances in neural information processing systems32\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1)\.
- \[37\]N\. Yousefi, M\. Yazdani\-Jahromi, A\. Tayebi, E\. Kolanthai, C\. J\. Neal, T\. Banerjee, A\. Gosai, G\. Balasubramanian, S\. Seal, and O\. Ozmen Garibay\(2023\)BindingSite\-augmenteddta: enabling a next\-generation pipeline for interpretable prediction models in drug repurposing\.Briefings in Bioinformatics24\(3\),pp\. bbad136\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p1.1)\.
- \[38\]X\. Zeng, K\. Zhong, P\. Meng, S\. Li, S\. Lv, M\. Wen, and Y\. Li\(2024\)MvGraphDTA: multi\-view\-based graph deep model for drug\-target affinity prediction by introducing the graphs and line graphs\.BMC biology22\(1\),pp\. 182\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p2.1)\.
- \[39\]L\. M\. Zintgraf, T\. S\. Cohen, T\. Adel, and M\. Welling\(2017\)Visualizing deep neural network decisions: prediction difference analysis\.arXiv preprint arXiv:1702\.04595\.Cited by:[§2\.1](https://arxiv.org/html/2606.14245#S2.SS1.p3.1)\.