SoK: A Comprehensive Analysis of the Current Status of Neural Tangent Generalization Attacks with Research Directions
Summary
This paper presents a comprehensive analysis of the Neural Tangent Generalization Attack (NTGA) for data protection, including a taxonomy of related attacks, and discusses future research directions.
# SoK: A Comprehensive Analysis of the Current Status of Neural Tangent Generalization Attacks with Research Directions
Source: [https://arxiv.org/html/2605.12792](https://arxiv.org/html/2605.12792)
Because of their remarkable performance, Deep Neural Networks (DNNs) are used in applications across many areas (Goodfellow et al., [2016](https://arxiv.org/html/2605.12792#bib.bib75); Li et al., [2021](https://arxiv.org/html/2605.12792#bib.bib73); Poznyak et al., [2019](https://arxiv.org/html/2605.12792#bib.bib74)). To achieve the best results, DNNs need to be trained on a large set of data. At the same time, the Internet holds an enormous amount of data that belongs to different owners. Driven by the need for large datasets, DNN practitioners tend to collect data from the Internet in an illegitimate manner. Hence, data owners who are unwilling to share their information are demanding methods to protect their data from unauthorized exploitation.
Fortunately, substantial research effort has gone into preventing the unauthorized use of data through data protection approaches. In 2021, Yuan & Wu (Yuan and Wu, [2021](https://arxiv.org/html/2605.12792#bib.bib1)) introduced one such approach, the Neural Tangent Generalization Attack (NTGA), which is based on Neural Tangent Kernels (NTKs) (Jacot et al., [2018](https://arxiv.org/html/2605.12792#bib.bib2)). NTGA is crafted by adding imperceptible perturbations to the training dataset. A model trained on the perturbed dataset then performs poorly on a validation or test dataset, so the data are worthless to DNN practitioners. Since the perturbations are imperceptible to the human eye, the data can still be used for other regular purposes as usual.
NTGA represents a significant advancement in data protection approaches for several reasons: (1) it addresses the transferability of data protection approaches across models; (2) it leverages Gaussian Processes through the NTK approximation of wide neural networks to generate data poisoning attacks; and (3) it demonstrated remarkable performance in protecting data compared to the approaches existing at the time. Its major importance lies in being the first clean-label generalization attack under the black-box setting. Why is the black-box nature crucial? Because we cannot predict what model an unauthorized user might employ, the perturbations must be effective against any model. Several years after the introduction of NTGA, we revisit its role in the domain of data protection approaches. Our objective is to explicitly address the following question: What has been discovered about NTGA since its proposal, and what future research directions does it suggest? Even though there are substantial surveys on adversarial attacks and data poisoning attacks, which may lead the way to data protection approaches, none of them explicitly discusses or reviews NTGA (Fan et al., [2022](https://arxiv.org/html/2605.12792#bib.bib55); Ahmed and Kashmoola, [2021](https://arxiv.org/html/2605.12792#bib.bib56); Wang et al., [2022](https://arxiv.org/html/2605.12792#bib.bib32); Tian et al., [2022](https://arxiv.org/html/2605.12792#bib.bib67); Long et al., [2022](https://arxiv.org/html/2605.12792#bib.bib69); Meng et al., [2022](https://arxiv.org/html/2605.12792#bib.bib68); Li et al., [2025](https://arxiv.org/html/2605.12792#bib.bib104)).
Though NTGA is a data protection approach, it can also be viewed as an attack against DNNs\. In this paper, we present a taxonomy of NTGA\-related attacks against DNNs while providing fundamental knowledge on each type of attack\. The taxonomy clearly positions NTGA within the broader landscape of attacks targeting DNNs\. Specifically, NTGA is a generalization attack under the category of data poisoning, aiming to degrade model performance by corrupting the training data\. Although generalization attacks have been studied for years, they face challenges when targeting diverse DNN architectures\. NTGA addresses this issue by incorporating NTK theory\. Due to its architecture\-agnostic nature, NTGA can be classified as a black\-box data poisoning attack, which is a rarely explored area\. To support this fact, we present a separate taxonomy focused on black\-box attacks against DNNs\.
In the literature, we found several studies that reveal essential features of NTGA from perspectives that were not discussed in Yuan & Wu (Yuan and Wu, [2021](https://arxiv.org/html/2605.12792#bib.bib1)). Section [3](https://arxiv.org/html/2605.12792#S3) is organized around them, and Table [1](https://arxiv.org/html/2605.12792#S1) summarizes that section. First, we explore the performance of NTGA compared to other clean-label generalization attacks (data protection approaches). When introducing NTGA, Yuan & Wu (Yuan and Wu, [2021](https://arxiv.org/html/2605.12792#bib.bib1)) only compared their attack with DeepConfuse (Feng et al., [2019](https://arxiv.org/html/2605.12792#bib.bib11)) and the Return Favour Attack (RFA) (Chan-Hon-Tong, [2019](https://arxiv.org/html/2605.12792#bib.bib13)). The standard practice when proposing a new attack is to compare its effectiveness with existing attacks. Hence, it is essential to identify the strong and weak attacks so that researchers can easily decide which attacks to consider in their experiments. Our paper provides enough facts, including experimental demonstrations, for researchers to make that decision on NTGA. Next, we analyze the current status of NTGA against its challenges, adversarial training and data augmentation. In addition to analyzing the existing studies, we provide our own experimental evidence to explain the insights and conclusions listed in Table [1](https://arxiv.org/html/2605.12792#S1). Afterward, we explore the linear separability of NTGA and other data protection approaches and how it impacts their robustness. We provide experimental results showing that an unauthorized user can determine whether data are perturbed based on their linear separability. Furthermore, we investigate other attacks (Jacot et al., [2018](https://arxiv.org/html/2605.12792#bib.bib2)) that use Neural Tangent Kernels (NTKs), which serve as the building blocks of NTGA, and identify their similarities with NTGA.
After evaluating the current status of NTGA, we conclude our findings by explicitly stating the pros and cons of the attack\. These conclusions provide insights to facilitate researchers in making decisions regarding the use of NTGA\. Then, we propose new insights and perspectives on NTGA that can be used for further research\. New research directions include analyzing the transferability of NTGA with other attacks, applying NTGA to real\-world data, investigating the performance of NTGA on distributed machine learning algorithms and unsupervised machine learning algorithms, and extending NTGA on other data types\.
Figure 1. Attacks against DNNs can be categorized using several criteria. We consider three major criteria that are also relevant to NTGA. This classification illustrates that NTGA is a clean-label, black-box generalization attack.
The contributions of this paper can be summarized as follows:
- We specify where NTGA lies within the broad domain of attacks against DNNs. Moreover, we present a taxonomy of black-box attacks on DNNs to highlight the significance of NTGA among existing methods.
- We conduct a comprehensive analysis of the current status of NTGA and identify gaps not covered in Yuan & Wu (Yuan and Wu, [2021](https://arxiv.org/html/2605.12792#bib.bib1)). Our analysis reveals that several recently proposed data protection approaches outperform NTGA. Through experiments, we reveal NTGA's susceptibility to adversarial training and data augmentation, and explore strategies for addressing these limitations. We also reveal that the linear separability of images perturbed by NTGA and related attacks increases their susceptibility to additional threats.
- Based on the challenges and key features of NTGA identified through our study, we propose several potential directions for future research on NTGA.
## 2\.Background
We first discuss a classification of attacks against DNNs\. Then, we focus our discussion on clean\-label generalization attacks in this paper\. Besides NTGA, we also consider a few other clean\-label generalization attacks\. Furthermore, we study black\-box attacks against DNNs because NTGA plays a vital role in the context of black\-box attacks\. Moreover, we expand our discussion to adversarial training, one of the most powerful defenses against prevailing attacks\.
### 2\.1\.Attacks against Deep Neural Networks \(DNNs\)
Attacks against DNNs can be classified in different ways. Fig. [1](https://arxiv.org/html/2605.12792#S1.F1) shows a particular classification of these attacks focusing on data protection approaches, where one of the main criteria for classifying DNN attacks is their time of occurrence. Based on this criterion, attacks fall into two categories: data poisoning attacks and adversarial attacks.
Data poisoning attacks: They occur during model training. Data poisoning attacks manipulate training data such that the model performs poorly on clean test data (Wen et al., [2023](https://arxiv.org/html/2605.12792#bib.bib87)). These attacks corrupt DNNs and make them unable to generalize. Various data poisoning attacks have been introduced so far, such as the error-minimizing attack, the error-maximizing attack, NTGA, and so on (Fowl et al., [2021](https://arxiv.org/html/2605.12792#bib.bib10)) (Feng et al., [2019](https://arxiv.org/html/2605.12792#bib.bib11)) (Huang et al., [2021](https://arxiv.org/html/2605.12792#bib.bib12)) (Wang et al., [2021](https://arxiv.org/html/2605.12792#bib.bib31)) (Wu et al., [2022](https://arxiv.org/html/2605.12792#bib.bib47)). Data poisoning attacks are also referred to as causative attacks (Machado et al., [2020](https://arxiv.org/html/2605.12792#bib.bib16)) (Biggio et al., [2012](https://arxiv.org/html/2605.12792#bib.bib9)). We can further classify data poisoning attacks into two categories: generalization attacks and integrity attacks. Generalization attacks are also known as availability attacks (Zhao and Lao, [2022](https://arxiv.org/html/2605.12792#bib.bib48)), and their main goal is to degrade overall model accuracy, including validation and test accuracy. Integrity attacks, on the other hand, cause the model to misclassify specific images in a clean test set, thereby degrading its test accuracy. Poison Frogs (Shafahi et al., [2018](https://arxiv.org/html/2605.12792#bib.bib41)) is one of the major integrity attacks (Huang et al., [2020](https://arxiv.org/html/2605.12792#bib.bib22)) (Zhu et al., [2019](https://arxiv.org/html/2605.12792#bib.bib40)). Unlike generalization attacks, integrity attacks do not degrade validation accuracy.
Adversarial attacks: While data poisoning attacks harm the model at training time, adversarial attacks occur at test time. Attackers add imperceptible perturbations to the test data so that pre-trained models misclassify them with high confidence (Szegedy et al., [2014](https://arxiv.org/html/2605.12792#bib.bib8)) (Goodfellow et al., [2015](https://arxiv.org/html/2605.12792#bib.bib58)). Adversarial attacks are also known as evasive attacks (Machado et al., [2020](https://arxiv.org/html/2605.12792#bib.bib16)). Popular adversarial attacks include the Projected Gradient Descent (PGD) attack (Madry et al., [2017](https://arxiv.org/html/2605.12792#bib.bib15)), the Fast Gradient Sign Method (FGSM) attack (Huang et al., [2017](https://arxiv.org/html/2605.12792#bib.bib25)), the DeepFool attack (Moosavi-Dezfooli et al., [2016](https://arxiv.org/html/2605.12792#bib.bib60)), and Carlini and Wagner's attack (C&W) (Carlini and Wagner, [2017](https://arxiv.org/html/2605.12792#bib.bib26)).
By definition, an adversarial attack is a mapping $\alpha: R^{n} \rightarrow R^{n}$ such that the adversarial example $\alpha(x) = x'$ is misclassified by the model $f$ into a class other than its original class $y$. The difference between $x'$ and $x$ is small, i.e., $\|x - x'\|_{p} \leq \epsilon$ for some small value $\epsilon$ (Lin et al., [2021](https://arxiv.org/html/2605.12792#bib.bib66)). For a better understanding of adversarial attacks, we discuss the FGSM (Huang et al., [2017](https://arxiv.org/html/2605.12792#bib.bib25)) attack as an example, where the adversarial example $x'$ is crafted by solving the following optimization problem:

$$\max_{x'} \mathcal{L}(f(x'), y) \quad \text{subject to}$$

$$\|x - x'\|_{\infty} \leq \epsilon, \tag{1}$$

where $x'$ is generated by maximizing the loss function of the DNN model $f$. At the same time, the constraint keeps the dissimilarity between the adversarial example and the original data small. Hence, a trained model will misclassify $x'$ even though it looks like $x$ and a human can still correctly classify it as class $y$.
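The single-step idea behind Equation 1 can be illustrated on a model small enough to differentiate by hand. The following is a minimal sketch, assuming a logistic-regression classifier in place of a DNN; the weights `w`, bias `b`, input `x`, and budget `eps` are hypothetical toy values, not taken from any of the cited works.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(w, b, x, y):
    """Binary cross-entropy of a logistic-regression model on one example."""
    p = sigmoid(x @ w + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def fgsm(w, b, x, y, eps):
    """One signed-gradient step that raises the loss while keeping ||x' - x||_inf <= eps."""
    grad_x = (sigmoid(x @ w + b) - y) * w   # gradient of the loss w.r.t. the input x
    return x + eps * np.sign(grad_x)

# Hypothetical toy model and input (assumptions for illustration only).
w = np.array([2.0, -1.0, 0.5])
b = 0.1
x = np.array([0.3, -0.2, 0.8])
y = 1.0

x_adv = fgsm(w, b, x, y, eps=0.1)
loss_before = bce_loss(w, b, x, y)
loss_after = bce_loss(w, b, x_adv, y)
```

Because every coordinate moves by exactly `eps` in the direction of the loss gradient's sign, the perturbation sits on the boundary of the $L_{\infty}$ ball, and for this linear model the loss on `x_adv` is higher than on `x`.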
Furthermore, we can categorize attacks against DNNs according to whether the attacker manipulates the target labels of the inputs. If the attacker changes the target labels when building the attack, we call them dirty-label attacks. If the attacker does not influence the labels of the inputs, those attacks are called clean-label attacks. According to the attacker's knowledge of the target model, attacks can also be classified into three categories. (1) In black-box attacks, the attacker has no information about the targeted model. (2) If the attacker has partial knowledge about the model but not exact information such as weights, those attacks are called gray-box attacks. (3) White-box attacks happen when the attacker has complete knowledge of the targeted model. However, the classification of attacks against DNNs is not restricted to the three criteria in Fig. [1](https://arxiv.org/html/2605.12792#S1.F1) (Vorobeychik and Kantarcioglu, [2018](https://arxiv.org/html/2605.12792#bib.bib17)) (Wang et al., [2022](https://arxiv.org/html/2605.12792#bib.bib32)). This paper focuses only on these three criteria, which are relevant to NTGA. As illustrated in Fig. [1](https://arxiv.org/html/2605.12792#S1.F1), NTGA can be categorized as a clean-label, black-box generalization attack.
Next, we thoroughly explore clean-label generalization attacks. The notation in Table [6.1](https://arxiv.org/html/2605.12792#S6.SS1) in the Appendix is a helpful reference for the following sections. Let the training set be denoted by $X^{D}$ and its target labels by $Y^{D}$. A validation set is denoted by $X^{V}$ and its labels by $Y^{V}$. The set of perturbations added to $X^{D}$ is denoted by $\delta$. Perturbations are bounded in the $L_{p}$ norm, i.e., $\|\delta\|_{p} \leq \epsilon$. Note that $f$ denotes the target model and $\theta$ represents the set of model parameters. $\mathcal{L}$ represents the model's loss function.
###### Definition 2\.0 \(Clean\-label generalization attacks\)\.
A clean-label generalization attack is crafted by solving the following bi-level optimization problem (Yuan and Wu, [2021](https://arxiv.org/html/2605.12792#bib.bib1)) (Segura et al., [2022](https://arxiv.org/html/2605.12792#bib.bib7)) (Chan-Hon-Tong, [2019](https://arxiv.org/html/2605.12792#bib.bib13)):

$$\arg\max_{\|\delta\|_{p} \leq \epsilon} \left[ \mathcal{L}(f(X^{V}; \theta^{*}), Y^{V}) \right] \quad \text{subject to}$$

$$\theta^{*} = \arg\min_{\theta} \left[ \mathcal{L}(f(X^{D} + \delta; \theta), Y^{D}) \right] \tag{2}$$
As explained before, models trained on data perturbed by generalization attacks perform poorly not only on a test dataset but also on a validation set. Fig. [2](https://arxiv.org/html/2605.12792#S2.F2) illustrates the function of perturbations generated by generalization attacks. There are two parties: a data owner and an unauthorized user. As shown at the top of Fig. [2](https://arxiv.org/html/2605.12792#S2.F2), the data owner adds imperceptible perturbations to the original data. These perturbations are generated through clean-label generalization attacks. The perturbed data are also referred to as unlearnable data. An unauthorized user uses the perturbed data to train a DNN model. The model may achieve high training accuracy; however, when it is used to make predictions on unseen data without perturbations (i.e., test/validation data), it produces incorrect predictions. This is because the DNN model cannot effectively learn from the training data due to the unlearnable perturbations. Therefore, the data become useless to an unauthorized user for training models. The data owner crafts the perturbations by minimizing the training loss and maximizing the validation loss. Under the black-box setting, $f$ in Equation 2 is unknown: it can be any target model chosen by the unauthorized user. NTGA is crafted by representing $f$ using the mean of Gaussian processes along with the NTK. Next, we explain NTGA.
Figure 2\.The function of perturbations generated by generalization attacks including NTGA\.
### 2\.2\.Neural Tangent Generalization Attacks \(NTGA\)
It is a known fact that infinite width neural networks and Gaussian processes are equivalent at initialization\(Leeet al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib18)\)\. In a fully connected neural network, when the limit of widths of hidden layers goes to infinity, the network function at initialization converges to a Gaussian distribution related to a kernel\. Moreover, it is identified that during the training of neural network, the evolution of a neural network function can be described by the kernel: NTK\(Jacotet al\.,[2018](https://arxiv.org/html/2605.12792#bib.bib2)\)\.
###### Definition 2\.0\.
The Neural Tangent Kernel (NTK) between two inputs $x_{i}$ and $x_{j}$ of a randomly initialized neural network converges, in the infinite-width limit, to the following kernel function (Jacot et al., [2018](https://arxiv.org/html/2605.12792#bib.bib2)):

$$K(x_{i}, x_{j}) = \mathbb{E}_{\theta}\left[ \nabla_{\theta} f(x_{i}; \theta)^{T} \nabla_{\theta} f(x_{j}; \theta) \right]. \tag{3}$$
As seen, the kernel function above has the form of a covariance function. The NTK between two inputs measures the similarity of the two inputs. Hence, an approximation to a range of neural networks can be established by combining the two concepts (Jacot et al., [2018](https://arxiv.org/html/2605.12792#bib.bib2)). As a result, we can rewrite the neural network function $f$ in Equation 2 using Gaussian processes along with the NTK. Then, the maximization problem in Equation 2 can be modified as (Yuan and Wu, [2021](https://arxiv.org/html/2605.12792#bib.bib1)):
$$\arg\max_{\|\delta\|_{p} \leq \epsilon} \left[ \mathcal{L}\left( \bar{f}(X^{V}; \hat{K}^{V,D}, \hat{K}^{D,D}, Y^{D}, t), Y^{V} \right) \right], \tag{4}$$

where $\bar{f}$ denotes the mean of the Gaussian process, $\hat{K}^{D,D}$ is the kernel matrix of the training dataset, and $\hat{K}^{V,D}$ is the kernel matrix between the validation dataset and the training dataset. The hyperparameter $t$ controls the point in training at which the attack takes effect. NTGA is crafted by solving the above optimization problem using projected gradient ascent.
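To make the quantities in Equations 3 and 4 concrete, the sketch below estimates an NTK by Monte Carlo and evaluates the objective that the attack would ascend. It rests on several assumptions that are not stated in this paper: a two-layer ReLU network $f(x) = a^{T}\mathrm{relu}(Wx)/\sqrt{m}$ of finite width as a stand-in for the infinite-width limit, a squared loss, and the standard closed-form mean prediction under gradient flow, $\bar{f} = \hat{K}^{V,D}(\hat{K}^{D,D})^{-1}(I - e^{-\eta \hat{K}^{D,D} t})Y^{D}$, with learning rate $\eta$. All function names and toy data are hypothetical, and the projected gradient ascent step itself is omitted because it needs automatic differentiation.

```python
import numpy as np

def empirical_ntk(X1, X2, W, a):
    """grad_theta f(x)^T grad_theta f(x') for f(x) = a^T relu(W x) / sqrt(m)."""
    m = W.shape[0]
    pre1, pre2 = X1 @ W.T, X2 @ W.T                     # pre-activations, shape (n, m)
    act1, act2 = np.maximum(pre1, 0), np.maximum(pre2, 0)
    gate1, gate2 = (pre1 > 0) * a, (pre2 > 0) * a       # a_k * relu'(w_k . x)
    k_a = act1 @ act2.T                                  # gradients w.r.t. a
    k_w = (gate1 @ gate2.T) * (X1 @ X2.T)                # gradients w.r.t. W
    return (k_a + k_w) / m

def ntk_estimate(X1, X2, width=512, draws=4, seed=0):
    """Monte Carlo estimate of the expectation over theta in Eq. 3."""
    rng = np.random.default_rng(seed)
    K = np.zeros((X1.shape[0], X2.shape[0]))
    for _ in range(draws):
        W = rng.standard_normal((width, X1.shape[1]))
        a = rng.standard_normal(width)
        K += empirical_ntk(X1, X2, W, a)
    return K / draws

def gp_mean_prediction(K_vd, K_dd, Y_d, t, lr=1.0):
    """K_vd K_dd^{-1} (I - exp(-lr * K_dd * t)) Y_d via an eigendecomposition of K_dd."""
    lam, V = np.linalg.eigh(K_dd)
    lam = np.maximum(lam, 1e-12)                         # guard against tiny eigenvalues
    scale = (1.0 - np.exp(-lr * lam * t)) / lam
    return K_vd @ (V * scale) @ V.T @ Y_d

def ntga_objective(delta, X_d, Y_d, X_v, Y_v, t):
    """Validation squared loss of the mean predictor trained on X_d + delta (cf. Eq. 4)."""
    Xp = X_d + delta
    pred = gp_mean_prediction(ntk_estimate(X_v, Xp), ntk_estimate(Xp, Xp), Y_d, t)
    return float(np.mean((pred - Y_v) ** 2))

# Hypothetical toy data: labels are the sign of the first feature.
rng = np.random.default_rng(1)
X_d, X_v = rng.standard_normal((8, 3)), rng.standard_normal((4, 3))
Y_d, Y_v = np.sign(X_d[:, :1]), np.sign(X_v[:, :1])
loss_clean = ntga_objective(np.zeros_like(X_d), X_d, Y_d, X_v, Y_v, t=64.0)
```

In this sketch the attacker never touches a concrete target model: the objective depends on the data only through kernel matrices, which is what makes the formulation independent of the architecture an unauthorized user later picks. A full attack would repeatedly step `delta` along the sign of the gradient of `ntga_objective` and clip it back into the $\epsilon$-ball.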
Throughout our study, we involve several other clean\-label generalization attacks to understand the current status of NTGA among its competitors\. We briefly discuss those attacks in the following section for a better understanding\.
### 2\.3\.Other Clean\-label Generalization Attacks
#### 2\.3\.1\.Error\-maximizing attack \(Adversarial poisoning\)
The perturbed images generated by adversarial attacks are known as adversarial examples (Goodfellow et al., [2015](https://arxiv.org/html/2605.12792#bib.bib58)). Fowl et al. (Fowl et al., [2021](https://arxiv.org/html/2605.12792#bib.bib10)) revealed that adversarial examples could be more effective as data poisoning attacks. They used a different approach from the usual data poisoning attack formulated in Equation 2. Simply put, they applied an adversarial attack mechanism to a fixed pre-trained model to generate poisons. This approach can be formulated as follows:
$$\arg\max_{\|\delta\|_{p} \leq \epsilon} \left[ \mathcal{L}(f(X^{D} + \delta; \theta^{*}), Y^{D}) \right]. \tag{5}$$

In contrast to NTGA, error-maximizing attacks do not follow a bi-level optimization process. $\theta^{*}$ in Equation [5](https://arxiv.org/html/2605.12792#S2.E5) represents the model parameters trained on clean data, and it is fixed during the generation of perturbations. This optimization problem is the same as Equation [2.1](https://arxiv.org/html/2605.12792#S2.Ex1), which is used to craft adversarial attacks. Fowl et al. (Fowl et al., [2021](https://arxiv.org/html/2605.12792#bib.bib10)) generated perturbations by solving this optimization problem using Projected Gradient Descent (PGD).
#### 2\.3\.2\.Error\-minimizing attack
The error-minimizing attack makes data unlearnable for DNNs by adding imperceptible noise obtained by optimizing in the opposite direction to the error-maximizing attack. This noise is generated so that the error of one or more training examples is driven close to zero. It therefore tricks the model into behaving as if there is nothing to learn from these examples. Huang et al. (Huang et al., [2021](https://arxiv.org/html/2605.12792#bib.bib12)) presented the following bi-level optimization problem to generate the attack:

$$\arg\min_{\theta} \left[ \min_{\|\delta\|_{p} \leq \epsilon} \mathcal{L}(f(X^{D} + \delta; \theta), Y^{D}) \right]. \tag{6}$$

Note that both optimization problems in Equation [6](https://arxiv.org/html/2605.12792#S2.E6) share the objective of minimizing the loss. The inner optimization problem finds perturbations $\delta$ that minimize the training loss. The outer optimization problem finds the parameters that minimize the same loss. In practice, they optimize $\delta$ over the training set after every $M$ steps of optimizing $\theta$. Huang et al. (Huang et al., [2021](https://arxiv.org/html/2605.12792#bib.bib12)) solved the inner minimization problem using PGD.
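The alternating schedule described above (several descent steps on $\theta$, then signed-gradient descent on $\delta$ with projection) can be sketched on a small model. This is a minimal illustration that assumes a bias-free logistic-regression classifier instead of a DNN; the budget `eps`, the step sizes, the number of rounds, and the toy data are hypothetical choices and not the settings used by Huang et al.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def loss_and_grads(w, X, y):
    """Mean binary cross-entropy of a linear model, with gradients w.r.t. w and X."""
    p = sigmoid(X @ w)
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    err = (p - y) / len(y)
    return loss, X.T @ err, np.outer(err, w)

def error_minimizing_noise(X, y, eps=0.5, rounds=20, M=5,
                           lr_w=0.5, lr_d=0.1, pgd_steps=10):
    w = np.zeros(X.shape[1])
    delta = np.zeros_like(X)
    for _ in range(rounds):
        for _ in range(M):                  # outer problem: M descent steps on theta
            _, g_w, _ = loss_and_grads(w, X + delta, y)
            w -= lr_w * g_w
        for _ in range(pgd_steps):          # inner problem: descend on delta, then project
            _, _, g_x = loss_and_grads(w, X + delta, y)
            delta = np.clip(delta - lr_d * np.sign(g_x), -eps, eps)
    return delta, w

# Hypothetical toy data: noisy labels driven by the first feature.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 5))
y = (X[:, 0] + 0.5 * rng.standard_normal(40) > 0).astype(float)

delta, w = error_minimizing_noise(X, y)
loss_clean, _, _ = loss_and_grads(w, X, y)
loss_noisy, _, _ = loss_and_grads(w, X + delta, y)
```

Note that both loops descend the same loss, which is the defining difference from the max-based formulations earlier in this section. In this toy setting the final model fits the perturbed inputs better than the clean ones, mirroring the idea that the noise removes the error signal from the training examples.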
#### 2\.3\.3\.DeepConfuse attack
DeepConfuse is another data poisoning attack that we encounter throughout this paper. When solving the optimization problem in Equation 2, Feng et al. (Feng et al., [2019](https://arxiv.org/html/2605.12792#bib.bib11)) used an encoder-decoder neural network $g_{\xi}$ to generate the noise. Hence, we can rewrite Equation 2 as follows:

$$\arg\max_{\xi} \left[ \mathcal{L}(f(X^{D}; \theta^{*}), Y^{D}) \right] \quad \text{subject to}$$

$$\theta^{*} = \arg\min_{\theta} \left[ \mathcal{L}(f(X^{D} + g_{\xi}(X^{D}); \theta), Y^{D}) \right]. \tag{7}$$

Feng et al. (Feng et al., [2019](https://arxiv.org/html/2605.12792#bib.bib11)) alternately updated $f$ on the perturbed data using gradient descent and updated the weights of $g_{\xi}$ on the clean data using gradient ascent.
### 2\.4\.Adversarial Training
Adversarial training is a powerful remedy for most attacks, including data protection approaches such as NTGA. It consists of training DNNs on adversarial examples so that the models are not vulnerable to adversarial attacks at test time. Adversarial examples are crafted using adversarial attacks such as PGD attacks (Madry et al., [2017](https://arxiv.org/html/2605.12792#bib.bib15)). Even though adversarial training was initially introduced to defend against adversarial attacks, it has been shown that it can also be used against data poisoning attacks (Fu et al., [2022](https://arxiv.org/html/2605.12792#bib.bib4)) (Geiping et al., [2022](https://arxiv.org/html/2605.12792#bib.bib27)) (Wang et al., [2021](https://arxiv.org/html/2605.12792#bib.bib31)). Mathematically, adversarial training can be formulated as the following optimization problem:
$$\min_{\theta} \left[ \max_{\|\delta\|_{p} \leq \rho_{a}} \mathcal{L}(f(X^{D} + \delta; \theta), Y^{D}) \right], \tag{8}$$

where $\rho_{a} > 0$ denotes the adversarial perturbation radius (Madry et al., [2017](https://arxiv.org/html/2605.12792#bib.bib15)), which is the maximum perturbation allowed. The adversarial perturbation radius is sometimes also called the defense budget (Tao et al., [2022](https://arxiv.org/html/2605.12792#bib.bib3)). A larger perturbation radius gives stronger adversarial robustness, but it results in more visible perturbations. The inner maximization problem in Equation [8](https://arxiv.org/html/2605.12792#S2.E8) finds, for each input, a perturbation that gives maximum loss, and the outer optimization problem finds the model parameters that minimize the adversarial loss of the inner problem. Adversarial training thus tolerates a bounded perturbation of each input during training. Therefore, the model cannot easily be tricked by adversarial inputs and training-time perturbations (Huang et al., [2020](https://arxiv.org/html/2605.12792#bib.bib22)).
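The min-max structure of Equation 8 maps onto a simple training loop: approximate the inner maximization with a few PGD steps, then take one descent step on the parameters using the perturbed batch. The sketch below assumes a bias-free logistic-regression model and an $L_{\infty}$ ball; the radius `rho_a`, step sizes, epoch count, and toy data are hypothetical values chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def grads(w, X, y):
    """Gradients of the mean binary cross-entropy w.r.t. weights w and inputs X."""
    err = (sigmoid(X @ w) - y) / len(y)
    return X.T @ err, np.outer(err, w)

def pgd_inner_max(w, X, y, rho_a, steps=5):
    """Approximate the inner maximization inside the L_inf ball of radius rho_a."""
    delta = np.zeros_like(X)
    for _ in range(steps):
        _, g_x = grads(w, X + delta, y)
        delta = np.clip(delta + (rho_a / 2) * np.sign(g_x), -rho_a, rho_a)
    return delta

def adversarial_training(X, y, rho_a=0.1, epochs=200, lr=0.5):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        delta = pgd_inner_max(w, X, y, rho_a)   # worst-case perturbation for current w
        g_w, _ = grads(w, X + delta, y)         # outer minimization step on theta
        w -= lr * g_w
    return w

# Hypothetical toy data: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
y = (rng.random(60) > 0.5).astype(float)
X = rng.standard_normal((60, 2)) + 3.0 * (2 * y - 1)[:, None]

w = adversarial_training(X, y)
clean_acc = np.mean((X @ w > 0) == (y == 1))
```

The relevance to data protection is that any training-time perturbation with norm at most `rho_a` lies inside the set the inner loop already searches, so a defender who picks a radius at least as large as the protection budget $\epsilon$ trains against perturbations of that size by construction.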
Figure 3. Taxonomy of attacks under the black-box setting. By exploring example attacks under each type of black-box attack, we identify a lack of black-box attacks among clean-label generalization attacks. We explicitly show how NTGA fills that gap in the taxonomy.
### 2\.5\.What makes NTGA different from other data protection approaches?
The main difference between NTGA and the other clean-label generalization attacks mentioned in Section [2.3](https://arxiv.org/html/2605.12792#S2.SS3) is that NTGA is a black-box attack. Applying Gaussian processes and NTKs to approximate entire wide neural networks makes NTGA a successful black-box attack. The other data protection approaches described in Section [2.3](https://arxiv.org/html/2605.12792#S2.SS3) use pre-determined models when generating the attack, which may limit the transferability of their perturbations. Yuan & Wu (Yuan and Wu, [2021](https://arxiv.org/html/2605.12792#bib.bib1)) claimed that NTGA is the first clean-label generalization attack under the black-box setting, and we agree. Even though there are several black-box integrity attacks, such as back-door attacks (Chen et al., [2017](https://arxiv.org/html/2605.12792#bib.bib24)) and Label noise (van Rooyen et al., [2015](https://arxiv.org/html/2605.12792#bib.bib23)), black-box generalization attacks are rarely explored. The Poisonous label attack (Liu et al., [2021a](https://arxiv.org/html/2605.12792#bib.bib21)) is a black-box generalization attack, but unlike NTGA, it is a dirty-label attack. MetaPoison (Huang et al., [2020](https://arxiv.org/html/2605.12792#bib.bib22)) and Poison Frogs (Shafahi et al., [2018](https://arxiv.org/html/2605.12792#bib.bib41)) are clean-label integrity attacks but cannot be considered black-box attacks. There are, however, several black-box attacks in the context of adversarial attacks, which we do not explore thoroughly in this paper (Papernot et al., [2017](https://arxiv.org/html/2605.12792#bib.bib19)) (Bhagoji et al., [2017](https://arxiv.org/html/2605.12792#bib.bib20)). Fig. [3](https://arxiv.org/html/2605.12792#S2.F3) depicts the taxonomy of attacks against DNNs under black-box settings. Using the taxonomy in Fig. [3](https://arxiv.org/html/2605.12792#S2.F3), we verify that NTGA fills the gap left by the lack of black-box attacks among clean-label generalization attacks.
Other data protection approaches, such as watermarking (Sharma et al., [2024](https://arxiv.org/html/2605.12792#bib.bib105)) and membership inference attacks (Shokri et al., [2016](https://arxiv.org/html/2605.12792#bib.bib106)), have different objectives from NTGA. Watermarking is designed to embed identifiable information, such as ownership or authentication marks, into images, whereas NTGA aims to corrupt the training process of a DNN model and reduce its generalization ability. Specifically, watermarking is not intended to have a negative impact on a DNN model's performance. Membership inference attacks, in contrast, aim to determine whether a particular image was included in a model's training dataset, allowing data owners to verify whether their data has been used to train a specific model. NTGA takes a different approach: it allows others to use the protected data for training, but the generated noise corrupts the learning process, making the data ineffective for training purposes. As a result, users are discouraged from using such data.
## 3\.Current Status of NTGA
In this section, we elaborate on the current status of NTGA from five aspects\. None of these perspectives were discussed in Yuan & Wu\(Yuan and Wu,[2021](https://arxiv.org/html/2605.12792#bib.bib1)\)when introducing NTGA\. At the end of the section, we present a summary of the pros and cons of NTGA\.
### 3\.1\.Performance of NTGA compared to Other Attacks
Table 2. Test accuracy of a pretrained ResNet50 trained on CIFAR-10 protected by each data protection approach.

Yuan & Wu (Yuan and Wu, [2021](https://arxiv.org/html/2605.12792#bib.bib1)) compared NTGA with the DeepConfuse (Feng et al., [2019](https://arxiv.org/html/2605.12792#bib.bib11)) attack and the Return Favour Attack (RFA) (Chan-Hon-Tong, [2019](https://arxiv.org/html/2605.12792#bib.bib13)), which were the baselines of their study. According to their results, NTGA outperforms both the DeepConfuse attack and the RFA in a black-box setting but not in a gray-box setting. Later, several studies proposed new attacks and compared them directly with NTGA. Those studies revealed further information about the performance of NTGA.
Yu et al. (Yu et al., [2021](https://arxiv.org/html/2605.12792#bib.bib5)) studied the performance of several data protection approaches, including NTGA. Their results show that the error-minimizing attack, the error-maximizing attack, synthetic perturbations, and the DeepConfuse attack perform better than NTGA in terms of test accuracy. Furthermore, a comparison of the test accuracies reported in Fu et al. (Fu et al., [2022](https://arxiv.org/html/2605.12792#bib.bib4)) shows that the error-minimizing attack outperforms NTGA, while the error-maximizing attack does not. The Robust Error-Minimizing (REM) approach introduced by Fu et al. (Fu et al., [2022](https://arxiv.org/html/2605.12792#bib.bib4)) appears to offer a slight improvement over NTGA in terms of data protection. NTGA demonstrated performance comparable to the recently introduced Enhanced Unlearnable Examples (EUN) (Chen et al., [2025](https://arxiv.org/html/2605.12792#bib.bib92)), which are generated using convolutional operations and single pixel-level modifications. However, unlike EUN, NTGA exhibits performance degradation when models are trained with adversarial training. NTGA shows a similar weakness when compared with Stable Error-Minimizing noise (SEM) (Liu et al., [2024b](https://arxiv.org/html/2605.12792#bib.bib98)), another recently proposed data protection approach. SEM modifies the REM approach by training defensive noise against random perturbations instead of adversarial perturbations. Overall, while NTGA is competitive with current data protection approaches, its vulnerability to adversarial training remains a significant limitation.
However, some recently proposed approaches, such as autoregressive perturbations\(Seguraet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib7)\)and the One\-Pixel shortcut attack\(Wuet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib47)\), have not been directly compared with NTGA\. Moreover, when reviewing multiple studies, it is difficult to draw definitive conclusions about their performance\. For instance, Yu et al\.\(Yuet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib5)\)reported that the error\-maximizing attack outperforms NTGA, whereas Fu et al\.\(Fuet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib4)\)found that NTGA performs better than the error\-maximizing approach\. To further investigate, we conducted our own comparison by training a ResNet50 model on the CIFAR\-10 dataset protected by various data protection methods\. Refer to Appendix[6\.2](https://arxiv.org/html/2605.12792#S6.SS2)for further details on the experimental settings\. The results are presented in Table[2](https://arxiv.org/html/2605.12792#S3.T2)\. Bold values indicate test accuracies lower than those achieved by NTGA, implying that those methods perform better\. Based on these results, NTGA outperforms both the error\-maximizing and autoregressive approaches\. However, these conclusions are highly dependent on the dataset and model architecture, and a comprehensive analysis is necessary to identify their performance under different circumstances\.
Another challenge of NTGA, as pointed out by Sandoval\-Segura et al\.\(Seguraet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib7)\), is that it takes a long time to generate the perturbations, making it difficult to apply NTGA to large datasets\. This cost arises from the surrogate model involved in crafting the perturbations\. Sadasivan et al\.\(Sadasivanet al\.,[2023a](https://arxiv.org/html/2605.12792#bib.bib70)\)showed that NTGA took 5\.2 hours to generate perturbations, while error\-maximizing attacks took only about 0\.5 hours under the same experimental settings\.
### 3\.2\.Challenges against NTGA
#### 3\.2\.1\.Adversarial Training
In this section, we discuss the performance of NTGA against adversarial training\(Madryet al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib15)\)\. Fu et al\.\(Fuet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib4)\)show that major data protection approaches are challenged by adversarial training\. In particular, after observing that error\-minimizing attacks can be defeated by adversarial training, Fu et al\.\(Fuet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib4)\)proposed an extended version of error\-minimizing attacks called the Robust Error\-Minimizing \(REM\) attack, which is robust to adversarial training\. Their study demonstrates that adversarial training cannot defeat REM, whereas the other baseline attacks do not withstand it\. Their baseline attacks include NTGA, the error\-minimizing attack, and the error\-maximizing attack\.
The behavior of NTGA under adversarial training can be further understood from Tao et al\.\(Taoet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib3)\), who introduce the stability attack, a clean\-label attack that challenges the test\-time robustness of adversarial training\. According to the results in Tao et al\.\(Taoet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib3)\), a model adversarially trained on NTGA\-protected data achieves almost the same test accuracy as a model adversarially trained on clean data\. These results imply that adversarial training defeats the protection given by NTGA\. Hence, they confirm the results of Fu et al\.\(Fuet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib4)\): NTGA cannot protect data against adversarial training\. However, data poisoning attacks such as One\-Pixel Shortcut\(Wuet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib47)\), ADVIN attacks\(Wanget al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib31)\), REM\(Fuet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib4)\), and the approach of Fang et al\.\(Fanget al\.,[2024](https://arxiv.org/html/2605.12792#bib.bib101)\)have been able to challenge adversarial training\. Furthermore, Sadasivan et al\.\(Sadasivanet al\.,[2023b](https://arxiv.org/html/2605.12792#bib.bib57)\)introduced a data protection approach called Filter\-based UNlearnable \(FUN\) and provided evidence that the protection given by NTGA, the error\-minimizing attacks, and the error\-maximizing attacks on the CIFAR\-10 dataset can be eliminated using adversarial training with a perturbation radius of 4/255\. Moreover, Sadasivan et al\.\(Sadasivanet al\.,[2023a](https://arxiv.org/html/2605.12792#bib.bib70)\)introduced a convolution\-based data protection approach \(CUDA\), which is faster and more effective against adversarial training\. In that study, they confirmed that NTGA, as well as the error\-minimizing and error\-maximizing attacks, can be defeated by adversarial training with a perturbation radius of 4/255\.
As we mentioned in Section[2](https://arxiv.org/html/2605.12792#S2), adversarial training was originally designed against adversarial attacks\. That is, adversarial training is supposed to defend models against attacks that occur at test time\. Tao et al\.\(Taoet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib3)\)investigated the effectiveness of adversarial training conducted on protected data under test\-time attacks\. In other words, they observed the test\-time robustness of models adversarially trained on protected data\. First, Tao et al\.\(Taoet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib3)\)conducted adversarial training \(PGD\-AT\(Madryet al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib15)\)\) on data protected by different approaches such as DeepConfuse, NTGA, and the error\-maximizing attack\. Then, they created test datasets with different noises to evaluate the test robustness of the adversarially trained models\. Test\-time adversarial attacks such as PGD\(Madryet al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib15)\), FGSM\(Huanget al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib25)\), and C&W\(Carlini and Wagner,[2017](https://arxiv.org/html/2605.12792#bib.bib26)\)were used to generate the noises\. We noticed that adversarial training conducted on clean data gives a test accuracy of around 50% against adversarial attacks\. However, adversarial training conducted on data protected by NTGA results in slightly lower test accuracies than on clean data\. This reduction in robust accuracy implies that NTGA slightly degrades the test robustness of adversarial training\. However, DeepConfuse and the error\-maximizing attack degrade test robustness more than NTGA\. The stability attack proposed by Tao et al\.\(Taoet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib3)\)outperforms all the clean\-label generalization attacks considered in terms of degrading the test robustness of adversarial training\.
Yuan & Wu\(Yuan and Wu,[2021](https://arxiv.org/html/2605.12792#bib.bib1)\)did not test NTGA against adversarial training\. The information discussed in the preceding three paragraphs suggests that NTGA may be vulnerable to adversarial training\. To verify this suspicion, we conducted adversarial training on CIFAR\-10 datasets protected by eight different data protection approaches\. The base model used was VGG19, adversarially trained using PGD attacks with a perturbation radius of 4/255 and a step size of 0\.8/255\. The resulting test accuracies are presented in Table[3](https://arxiv.org/html/2605.12792#S3.T3)\. A clean dataset typically yields a test accuracy of around 85%\. As shown in Table[3](https://arxiv.org/html/2605.12792#S3.T3), the NTGA, DeepConfuse, Error\-Minimizing, Error\-Maximizing, and Autoregressive approaches have reached similar accuracy levels, indicating their vulnerability to adversarial training\. Moreover, our experiments have shown that One\-Pixel Shortcut and REM are more robust to adversarial training\. Fig\.[4](https://arxiv.org/html/2605.12792#S3.F4)illustrates the robustness of One\-Pixel Shortcut and REM to adversarial training using the test accuracy curves obtained during adversarial training\. This highlights a potential direction for improving NTGA, as adversarial training remains a significant challenge\. For instance, the REM approach generates noise using an adversarially trained model\. Similarly, NTGA could be enhanced by incorporating an adversarially trained surrogate model instead of relying on standard training\. Additionally, shortcut learning techniques from the One\-Pixel Shortcut method, where only one pixel was modified per image class, could be integrated to further strengthen NTGA against adversarial training\.
Table 3\. Test accuracy of a VGG19 model trained on the CIFAR\-10 dataset protected by each data protection approach\.
Figure 4\. Test accuracy curves when adversarial training is conducted on protected CIFAR\-10 datasets\.
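The adversarial training setting described above (PGD with an L-infinity perturbation radius of 4/255 and a step size of 0.8/255) can be sketched as follows. This is only a minimal illustration on a linear softmax classifier with an analytic input gradient, not the VGG19 pipeline used in the experiments. The function names, the number of PGD steps (10), the learning rate, and the toy data are assumptions introduced for the example.

```python
import numpy as np

EPS = 4 / 255      # perturbation radius stated in the text
ALPHA = 0.8 / 255  # PGD step size stated in the text


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def input_grad(W, b, x, y_onehot):
    """Gradient of the cross-entropy loss w.r.t. the inputs of a linear softmax model."""
    p = softmax(x @ W + b)
    return (p - y_onehot) @ W.T


def pgd_attack(W, b, x, y_onehot, steps=10, rng=None):
    """L-infinity PGD: ascend the loss, then project back into the EPS-ball around x."""
    rng = np.random.default_rng(0) if rng is None else rng
    x_adv = np.clip(x + rng.uniform(-EPS, EPS, size=x.shape), 0.0, 1.0)
    for _ in range(steps):
        g = input_grad(W, b, x_adv, y_onehot)
        x_adv = x_adv + ALPHA * np.sign(g)        # signed gradient ascent step
        x_adv = np.clip(x_adv, x - EPS, x + EPS)  # projection onto the EPS-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv


def adversarial_training_step(W, b, x, y_onehot, lr=0.1):
    """One gradient step on PGD examples crafted against the current model."""
    x_adv = pgd_attack(W, b, x, y_onehot)
    err = (softmax(x_adv @ W + b) - y_onehot) / len(x)
    return W - lr * x_adv.T @ err, b - lr * err.sum(axis=0), x_adv


# Toy stand-in for a (possibly protected) training batch: 8 samples, 12 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=(8, 12))
y = np.eye(3)[rng.integers(0, 3, size=8)]
W = rng.normal(scale=0.1, size=(12, 3))
b = np.zeros(3)
W, b, x_adv = adversarial_training_step(W, b, x, y)
```

In a realistic setup the same loop would wrap a deep network and an automatic differentiation framework; the projection and clipping steps stay the same.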
#### 3\.2\.2\.Data Augmentation
It is a known fact that strong data augmentation can defeat data poisoning attacks\(Liuet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib71),[2021b](https://arxiv.org/html/2605.12792#bib.bib51)\)\. Huang et al\.\(Huanget al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib12)\)claimed that error\-minimizing attacks are robust to strong data augmentation techniques, such as CutMix\(Yunet al\.,[2019](https://arxiv.org/html/2605.12792#bib.bib54)\), Cutout\(Devries and Taylor,[2017](https://arxiv.org/html/2605.12792#bib.bib59)\)and mixup\(Zhanget al\.,[2018](https://arxiv.org/html/2605.12792#bib.bib53)\)\. However, Liu et al\.\(Liuet al\.,[2021b](https://arxiv.org/html/2605.12792#bib.bib51)\)showed that the effect of error\-minimizing attacks can be mitigated by simple grayscale pre\-filtering\. Moreover, Borgnia et al\.\(Borgniaet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib52)\)diminished the effect of backdoor attacks\(Chenet al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib24)\)using the mixup\(Zhanget al\.,[2018](https://arxiv.org/html/2605.12792#bib.bib53)\)and CutMix\(Yunet al\.,[2019](https://arxiv.org/html/2605.12792#bib.bib54)\)data augmentation techniques\. Hence, data poisoning attacks react unpredictably to data augmentation, which needs to be explored further\. Moreover, certain data augmentation techniques based on image compression, such as JPEG compression and SHIELD\(Daset al\.,[2018](https://arxiv.org/html/2605.12792#bib.bib63)\), provide defense against adversarial attacks\(Qiuet al\.,[2020](https://arxiv.org/html/2605.12792#bib.bib62)\)\(Zenget al\.,[2020](https://arxiv.org/html/2605.12792#bib.bib61)\)\.
Liu et al\.\(Liuet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib71)\)reveal that many data protection approaches, including NTGA, are also vulnerable to JPEG compression and grayscale transformations\. A recent study\(Hapuarachchiet al\.,[2024](https://arxiv.org/html/2605.12792#bib.bib84)\)showed that some clean\-label generalization attacks, including NTGA, can be defeated by nonlinear transformations\. Once a training dataset is augmented using simple yet effective nonlinear transformation techniques, NTGA can no longer protect the data\. Unlike Fu et al\.\(Fuet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib4)\), they did not incorporate any adversarial training techniques\. Hapuarachchi et al\.\(Hapuarachchiet al\.,[2024](https://arxiv.org/html/2605.12792#bib.bib84)\)used data augmentation techniques such as erode\(Chudasamaet al\.,[2015](https://arxiv.org/html/2605.12792#bib.bib65)\), dilate\(Chudasamaet al\.,[2015](https://arxiv.org/html/2605.12792#bib.bib65)\), color channel manipulation, and Gaussian blur\(Gedraite and Hadad,[2011](https://arxiv.org/html/2605.12792#bib.bib64)\)\. However, as shown in their study, this weakness is common to all clean\-label generalization attacks described in Section[2\.3](https://arxiv.org/html/2605.12792#S2.SS3), not only NTGA\. On the other hand, Sandoval\-Segura et al\.\(Seguraet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib91)\)performed linear transformations to attack data protected by availability\-based methods, including NTGA\. They captured the added linearly separable perturbations and attempted to remove them from the data\. Their approach improved the test accuracy of a ResNet18 model trained on the NTGA\-perturbed CIFAR\-10 dataset from 40\.78% to 82\.21%, showing that even linear transformations can effectively compromise NTGA\.
Dolatabadi et al\.\(Dolatabadiet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib72)\)proposed a new mechanism to defeat data protection approaches called dAta aVailAbiliTy Attacks defuseR \(AVATAR\), which was inspired by data augmentation\. They added Gaussian noise to the protected training data and conducted reverse diffusion using a pre\-trained diffusion model to remove the perturbations in the data\. In the study, they showed that data protection approaches such as NTGA, the error\-minimizing attack, and the error\-maximizing attack are vulnerable to AVATAR\. AVATAR did not incorporate the data augmentation techniques used by Hapuarachchi et al\.\(Hapuarachchiet al\.,[2024](https://arxiv.org/html/2605.12792#bib.bib84)\)\. However, Dolatabadi et al\.\(Dolatabadiet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib72)\)showed that their approach performs better than data augmentation techniques such as Cutout, CutMix, and mixup\. Moreover, Yu et al\.\(Yuet al\.,[2024](https://arxiv.org/html/2605.12792#bib.bib99)\)proposed a method to remove perturbations introduced by data protection approaches, including NTGA, using autoencoders\. They claim that their approach outperforms AVATAR in attacking NTGA\.
Countermeasures for data augmentation\-based attacks have also been explored in the literature, e\.g\.,\(Liuet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib71); Gonget al\.,[2025](https://arxiv.org/html/2605.12792#bib.bib90)\)\. Gong et al\.\(Gonget al\.,[2025](https://arxiv.org/html/2605.12792#bib.bib90)\)proposed a method to generate perturbations that are robust against data augmentation\. The perturbation generation process involves a carefully designed surrogate model and an augmentation selection strategy\. Additionally, Liu et al\.\(Liuet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib71)\)applied adaptive poisoning to error\-minimizing and Synthetic attacks to counter the effects of data augmentation techniques such as grayscale\. However, they did not consider NTGA, which we identify as a gap worth exploring in future to develop more robust attacks against data augmentation\.
In the above analysis, we observed that grayscale and a similar transformation known as color channel manipulation significantly affected the performance of data protection approaches, including NTGA\(Liuet al\.,[2021b](https://arxiv.org/html/2605.12792#bib.bib51),[2023](https://arxiv.org/html/2605.12792#bib.bib71); Hapuarachchiet al\.,[2024](https://arxiv.org/html/2605.12792#bib.bib84)\)\. Similarly, identifying a specific transformation that strongly impacts perturbations could be valuable for enhancing the robustness of NTGA\. Although transformation techniques have been widely investigated, prior work has not focused on pinpointing a single definitive transformation with this effect\. A promising starting point is the transformation proposed in\([84](https://arxiv.org/html/2605.12792#bib.bib103)\), which closely resembles the grayscale transformation\. This method implements the transformation by applying matrix multiplication to the image data\. Our objective is to identify the transformation matrix that most effectively improves the test accuracy\. Figure[5](https://arxiv.org/html/2605.12792#S3.F5)illustrates how such a transformation can be applied to NTGA\-perturbed data\. The variables R, G, and B represent the red, green, and blue channel values of each pixel, respectively, while D is a dummy variable introduced to enhance the transformation\. As shown in Figure[5](https://arxiv.org/html/2605.12792#S3.F5), we modified the transformation matrix with the goal of discovering an optimal configuration that can neutralize NTGA perturbations and restore test accuracy\. Identifying a transformation that consistently disrupts NTGA perturbations would represent a significant advancement toward more robust data protection methods\.
Figure 5\. Matrix transformations on NTGA\-perturbed data\. The variables R, G, and B represent the red, green, and blue channel values of each pixel, respectively, while D is a dummy variable introduced to enhance the transformation\.
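As a rough illustration of the matrix transformation idea, the sketch below applies a 3x4 matrix to every pixel vector [R, G, B, D]. The source does not specify how D is set or the shape of the matrix, so the constant dummy value `dummy=1.0`, the 3x4 shape, and the function name `transform_pixels` are assumptions made for this example. The starting matrix uses the standard ITU-R BT.601 luma weights (0.299, 0.587, 0.114), which reduces the transformation to an ordinary grayscale conversion.

```python
import numpy as np


def transform_pixels(images, M, dummy=1.0):
    """Apply a 3x4 matrix M to every pixel vector [R, G, B, D].

    images: float array of shape (N, H, W, 3) with values in [0, 1].
    dummy:  value of the extra variable D (hypothetical choice).
    """
    d = np.full(images.shape[:-1] + (1,), dummy, dtype=images.dtype)
    rgbd = np.concatenate([images, d], axis=-1)  # (N, H, W, 4)
    out = rgbd @ M.T                             # (N, H, W, 3)
    return np.clip(out, 0.0, 1.0)


# Grayscale-like starting point: every output channel is the same weighted
# sum of R, G, and B; the column for D is zero, so D has no effect here.
luma = np.array([0.299, 0.587, 0.114, 0.0])
M_gray = np.tile(luma, (3, 1))                   # shape (3, 4)

rng = np.random.default_rng(0)
batch = rng.uniform(0, 1, size=(2, 4, 4, 3))     # stand-in for perturbed images
gray = transform_pixels(batch, M_gray)
```

Searching for a better matrix would then amount to varying the entries of `M` (including the D column, which acts as a per-channel offset when D is constant) and retraining on the transformed data.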
### 3\.3\.Linear Separability of NTGA
Yu et al\.\(Yuet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib5)\)showed that perturbations generated by clean\-label generalization attacks, including NTGA, are linearly separable\. They considered the DeepConfuse attack, the error\-minimizing attack, the error\-maximizing attack, and NTGA to support their argument\. First, Yu et al\.\(Yuet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib5)\)created two\-dimensional t\-SNE plots of perturbations and clean data\. The t\-SNE plot is a statistical method to display high\-dimensional data points in a two\- or three\-dimensional plot\(van der Maaten and Hinton,[2008](https://arxiv.org/html/2605.12792#bib.bib14)\)\. Surprisingly, they observed that perturbations of the same class are clustered together, whereas clean data did not exhibit such a pattern\. Sadasivan et al\.\(Sadasivanet al\.,[2023b](https://arxiv.org/html/2605.12792#bib.bib57)\)also observed a similar pattern in the t\-SNE plot of REM\(Fuet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib4)\)perturbations\. Furthermore, Yu et al\.\(Yuet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib5)\)confirmed the linear separability by fitting linear models on perturbations\. Additionally, they showed that simple logistic regression models between target classes and perturbations can be fitted with more than 90% training accuracy\. They further supported their argument using a one\-layer \(linear\) neural network and a two\-layer neural network to classify perturbations\. The linear model achieved more than 90% training accuracy, while the two\-layer network gave nearly 100% training accuracy for all attacks considered\. Hence, these results confirmed that the perturbations of all clean\-label generalization attacks considered are linearly separable\.
Table 4\. Training accuracy of the logistic regression model fitted on CIFAR\-10 datasets\.
Furthermore, as demonstrated by Yu et al\.\(Yuet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib5)\), linear separability is a sufficient condition for availability attacks to succeed\. They supported their argument by using simple synthetic data as perturbations\. First, they generated synthetic perturbations and further processed them to withstand image augmentations, and showed that synthetic perturbations outperformed most of the attacks\. Hence, Yu et al\.\(Yuet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib5)\)have confirmed that linear separability is a sufficient condition for generalization attacks\. However, Sandoval\-Segura et al\.\(Seguraet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib91)\)challenged the notion that data protection approaches are effective because their perturbations are linearly separable\. They presented autoregressive perturbations\(Seguraet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib7)\)as a counterexample, which do not exhibit linear separability, unlike other attacks such as NTGA that exhibit this property\. To investigate this property, they trained a linear logistic regression model on the perturbations to assess their linear separability\. Furthermore, Zhu et al\.\(Zhuet al\.,[2024](https://arxiv.org/html/2605.12792#bib.bib95)\)confirmed that NTGA is indeed linearly separable and developed a detection algorithm capable of identifying data protected by availability attacks based on this property\.
In the aforementioned studies, researchers analyzed the linear separability of perturbations, not of the perturbed images\. For instance, Yu et al\.\(Yuet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib5)\)created t\-SNE plots of perturbations and classified them using a linear \(one\-layer\) neural network, while Sandoval\-Segura et al\.\(Seguraet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib91)\)applied logistic regression models on perturbations\. However, none of these studies evaluated the linear separability of the perturbed images themselves\. In a realistic setting, an unauthorized user may only have access to perturbed data\. Such a user could attempt to exploit the linear separability property of the perturbed data, but they cannot directly extract the perturbations since they lack access to the clean data\. To study this scenario, we conducted experiments to evaluate whether perturbed images exhibit linear separability\. Specifically, we fit a logistic regression model with the L\-BFGS solver on the CIFAR\-10 dataset, using the images and their corresponding class labels\. The training accuracies of the fitted models are reported in Table[4](https://arxiv.org/html/2605.12792#S3.T4)\. Our results show that images perturbed with NTGA, DeepConfuse, the error\-minimizing attack, and One\-Pixel Shortcut exhibit strong linear relationships with the class labels, while Error\-Maximizing and REM perturbations show moderate relationships\. In contrast, confirming the findings of Sandoval\-Segura et al\.\(Seguraet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib91)\), autoregressive perturbations do not display such linear separability; their training accuracy is nearly identical to that of clean data\. This indicates that an unauthorized user cannot manipulate or detect autoregressive perturbations through linear separability\. Overall, this analysis suggests that NTGA can be improved by reducing the linear separability of its perturbations\.
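The linear separability check on perturbed images can be illustrated with a small synthetic sketch. It is not the CIFAR\-10 experiment reported in Table 4. The "clean" data are random features with no class structure, the "perturbed" data add a hypothetical class\-wise pattern of magnitude 8/255, and the multinomial logistic regression is fitted with plain gradient descent instead of the L\-BFGS solver so that the example needs only NumPy. All names, sizes, and hyperparameters are assumptions.

```python
import numpy as np


def fit_softmax_regression(X, y, n_classes, lr=2.0, iters=300):
    """Fit multinomial logistic regression by gradient descent; return training accuracy."""
    Xc = X - X.mean(axis=0)                       # center the features
    Y = np.eye(n_classes)[y]
    W = np.zeros((X.shape[1], n_classes))
    b = np.zeros(n_classes)
    for _ in range(iters):
        z = Xc @ W + b
        z -= z.max(axis=1, keepdims=True)         # numerical stability
        P = np.exp(z)
        P /= P.sum(axis=1, keepdims=True)
        G = (P - Y) / len(X)                      # gradient w.r.t. the logits
        W -= lr * Xc.T @ G
        b -= lr * G.sum(axis=0)
    pred = np.argmax(Xc @ W + b, axis=1)
    return float(np.mean(pred == y))


rng = np.random.default_rng(0)
n, d, k = 3000, 200, 3
y = np.arange(n) % k                              # balanced toy labels
clean = rng.uniform(0, 1, size=(n, d))            # no class signal by construction

# Hypothetical class-wise perturbation: one fixed +/- 8/255 pattern per class.
patterns = (8 / 255) * rng.choice([-1.0, 1.0], size=(k, d))
perturbed = np.clip(clean + patterns[y], 0.0, 1.0)

acc_clean = fit_softmax_regression(clean, y, k)
acc_perturbed = fit_softmax_regression(perturbed, y, k)
print(f"clean: {acc_clean:.3f}  perturbed: {acc_perturbed:.3f}")
```

A noticeably higher training accuracy on the perturbed set than on the clean set is the kind of signal the experiment above looks for, and it requires no access to the clean data or to the perturbations themselves.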
### 3\.4\.Other Attacks against DNNs using NTKs
Tsilivis & Kempe\(Tsilivis and Kempe,[2022](https://arxiv.org/html/2605.12792#bib.bib6)\)proposed an adversarial attack using the NTK, called NTK perturbations\. Besides NTGA, this is the only prevailing attack involving the NTK\. Tsilivis and Kempe\(Tsilivis and Kempe,[2022](https://arxiv.org/html/2605.12792#bib.bib6)\)showed that NTK perturbations perform almost the same as the PGD attack\(Madryet al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib15)\), one of the most powerful adversarial attacks\. Another advantage of NTK perturbations over NTGA is that they admit an analytical expression for the perturbations found using the NTK, as follows\(Tsilivis and Kempe,[2022](https://arxiv.org/html/2605.12792#bib.bib6)\)\.
$$\eta_i = -\epsilon\, y_i \cdot \operatorname{sgn}\left(A_i^{T} H(X,X)^{-1} Y\right) \tag{9}$$
where $\eta_i$ denotes the perturbation bounded by $\epsilon$\. $H$ represents the NTK matrix\. $X$ and $Y$ indicate the training set and target labels, respectively\. $A_i$ denotes the matrix including derivatives of the NTK between $x_i$ and $x_j$ for all $j = 1 \dots n$\.
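To make the shapes in Eq. (9) concrete, the sketch below evaluates the expression numerically. Computing an actual NTK is outside the scope of this example, so an RBF kernel is used as a stand\-in for $H$; this substitution, the small diagonal jitter added before solving, the binary labels in {-1, +1}, and all function names are assumptions for illustration only. Here $A_i$ is taken to be the n-by-d matrix whose j-th row is the derivative of the kernel value between $x_i$ and $x_j$ with respect to $x_i$.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Stand-in kernel (NOT the NTK): k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)


def kernel_perturbation(x_i, y_i, X, Y, eps, gamma=0.5, jitter=1e-8):
    """Evaluate eta_i = -eps * y_i * sgn(A_i^T H(X, X)^{-1} Y) for the stand-in kernel."""
    H = rbf_kernel(X, X, gamma) + jitter * np.eye(len(X))  # (n, n), jitter for stability
    k = rbf_kernel(x_i[None, :], X, gamma)[0]               # (n,) kernel values k(x_i, x_j)
    # Row j of A_i is d k(x_i, x_j) / d x_i for the RBF kernel.
    A_i = -2.0 * gamma * (x_i[None, :] - X) * k[:, None]    # (n, d)
    alpha = np.linalg.solve(H, Y)                           # H(X, X)^{-1} Y
    return -eps * y_i * np.sign(A_i.T @ alpha)              # (d,)


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))                                 # toy training inputs
Y = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])             # toy binary labels
eta = kernel_perturbation(X[0], Y[0], X, Y, eps=8 / 255)
```

Because the expression is closed form, each perturbation costs one linear solve and one matrix-vector product, with no iterative optimization against a trained network.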
It is interesting to notice that both NTGA and NTK perturbations show strong transferability among models\. It means that both NTGA and NTK perturbations derived from a specific model can fool other types of models successfully\. We can suggest that involving NTKs may have contributed to the transferability of their attacks\. Thus, it is worth exploring the possibility of incorporating NTKs in order to create successful black\-box attacks\.
To conclude this section, we provide Table[5](https://arxiv.org/html/2605.12792#S3.T5), which summarizes the pros and cons of the NTGA\.
Table 5\. A summary of the pros and cons of NTGA\.
## 4\.Future Research Insights
### 4\.1\.Transferability of NTGA Compared to Other Clean\-label Generalization Attacks
We have noticed that the primary significance of NTGA, among other clean\-label generalization attacks, is being a black\-box attack\. The property of transferability guides the way to a successful black\-box attack\(Liuet al\.,[2016](https://arxiv.org/html/2605.12792#bib.bib28)\)\. The transferability of perturbations means that perturbations derived from a specific model can mislead models with different architectures\. Even though several other data protection approaches outperform the NTGA, more studies need to be conducted to compare the transferability of NTGA with other generalization attacks\. Since black\-box attacks are primarily introduced in adversarial attack settings, testing for transferability can be found mainly in the context of adversarial attacks\(Papernotet al\.,[2016](https://arxiv.org/html/2605.12792#bib.bib30)\)\.
A common approach for testing the transferability of data protection methods is generating perturbed images using a certain model architecture, training a model with a different architecture using those images, and investigating the trained model’s generalization ability\. Furthermore, there are advanced practices for evaluating transferability\. One approach is utilizing machine learning services hosted by Google and Amazon\. Huang et al\.\(Huanget al\.,[2020](https://arxiv.org/html/2605.12792#bib.bib22)\), who proposed a clean\-label data poisoning attack called MetaPoison, examined the transferability using the Google Cloud AutoML API\([33](https://arxiv.org/html/2605.12792#bib.bib83)\)\. Given a dataset, the Google Cloud AutoML API\([33](https://arxiv.org/html/2605.12792#bib.bib83)\)allows us to train models in a black\-box setting, hiding the training architecture from the user\. Papernot & McDaniel\(Papernotet al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib19)\)used both Google and Amazon machine learning services to test transferability\([59](https://arxiv.org/html/2605.12792#bib.bib82)\)\. Moreover, the Google Cloud Vision API\([34](https://arxiv.org/html/2605.12792#bib.bib81)\)is a tool that can be used to evaluate adversarial attacks in a black\-box setting\(Huang and Zhang,[2019](https://arxiv.org/html/2605.12792#bib.bib29)\)\.
We suggest testing the transferability of other clean\-label generalization attacks, such as the error\-minimizing\(Huanget al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib12)\)and the error\-maximizing\(Fowlet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib10)\)attacks, to emphasize the performance of NTGA compared to others\. As a method for improving transferability, we can also consider the approach proposed by Chen et al\.\(Chenet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib88)\)\. They generated an attack using model checkpoints that exhibit diverse behaviors, effectively simulating DNNs with different architectures, rather than relying on a single model\.
### 4\.2\.Applications of Clean\-label Generalization Attacks to Real\-world Datasets
Throughout our study, we observed that the generalization attacks protect data effectively based on the experiments conducted using the benchmark datasets, such as the CIFAR\-10\(Krizhevskyet al\.,[2009](https://arxiv.org/html/2605.12792#bib.bib35)\), the ImageNet\(Denget al\.,[2009](https://arxiv.org/html/2605.12792#bib.bib34)\)and the MNIST\(Ciresanet al\.,[2011](https://arxiv.org/html/2605.12792#bib.bib36)\)\. However, the possibility of applying these attacks on real\-life data is yet to be investigated\. Moreover, it is vital to verify that the effectiveness of these attacks prevails on the real\-world datasets\. Given that the real\-world data are more complicated than the standard datasets, exploring the compatibility of these attacks under realistic settings is essential\. Huang et al\.\(Huanget al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib12)\)provided such a demonstration of their attack on the WebFace dataset\(Yiet al\.,[2014](https://arxiv.org/html/2605.12792#bib.bib38)\), a widely used face recognition dataset\. Furthermore, Wang et al\.\(Wanget al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib31)\)demonstrated a real\-world application of the ADVIN attack on the same dataset\. Shan et al\.\(Shanet al\.,[2020](https://arxiv.org/html/2605.12792#bib.bib80)\), proposed a data protection approach against the facial recognition DNN models based on the WebFace\(Yiet al\.,[2014](https://arxiv.org/html/2605.12792#bib.bib38)\)and the VGGFace2\(Caoet al\.,[2018](https://arxiv.org/html/2605.12792#bib.bib85)\)datasets\.
Moreover, it would be beneficial to provide guidelines on applying these attacks to real\-life data so that any data owner can easily utilize them to protect their data\. Ma et al\.\(Maet al\.,[2020](https://arxiv.org/html/2605.12792#bib.bib33)\)extended their study to provide a visual analytics framework to show model vulnerabilities to data poisoning attacks\. In the literature, we can observe that some adversarial attacks are evaluated on real\-world data\(Kurakinet al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib37)\)\(Sharifet al\.,[2016](https://arxiv.org/html/2605.12792#bib.bib39)\)\. For instance, Kurakin et al\.\(Kurakinet al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib37)\)examined the vulnerability of adversarial images obtained from mobile phone cameras\. However, there is a lack of data poisoning attacks that are evaluated under realistic scenarios\. The concept used for adversarial attacks must be restructured for data poisoning attacks since they are executed during training, not during testing\. Additionally, researchers can leverage APBench\(Qinet al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib94)\), which offers accessible implementations of availability attacks and establishes a unified benchmark to support future research\. By using APBench, inconsistencies in coding environments across different datasets and experimental setups can be mitigated, enabling more standardized and reproducible evaluations of various attack methods\.
### 4\.3\.NTGA on Distributed Machine Learning
Conventional machine learning algorithms are not efficient enough to handle the current rapid growth in data\. Training on larger datasets requires an enormous number of parameters, which is impossible to handle with available computing power\. However, given that increasing data reduces the learning error, it is necessary to find a way to deal with large\-scale data\. Distributed machine learning is an approach that can handle the analysis of extensive datasets efficiently\. Distributed machine learning algorithms are implemented on multiple nodes, which allows handling larger input datasets\(Galakatoset al\.,[2018](https://arxiv.org/html/2605.12792#bib.bib42)\)\(Verbraekenet al\.,[2020](https://arxiv.org/html/2605.12792#bib.bib43)\)\. Since these approaches are prevalent, it is interesting to explore how NTGA performs under distributed machine learning\. For future research, we can focus on the problem: Can NTGA protect data against distributed learning algorithms? Several studies have been conducted on data poisoning attacks under distributed machine learning in the past\(Tianet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib44)\)\(Tomsettet al\.,[2019](https://arxiv.org/html/2605.12792#bib.bib45)\)\(Funget al\.,[2018](https://arxiv.org/html/2605.12792#bib.bib46)\)\. They mainly involve federated learning\(Tianet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib44)\), a popular variant of distributed machine learning\.
### 4\.4\.NTGA on Unsupervised Learning Models
It is important to notice that NTGA is crafted and tested focusing only on supervised learning algorithms\. However, we cannot predict the unauthorized user’s choice of the target model\. Since unsupervised learning is also popular in deep learning applications, alongside supervised learning, there is a valid chance that data would be used to train unsupervised learning models\. Since NTGA is a black\-box attack, it is interesting to investigate how it performs on unsupervised learning methods\. Moreover, He et al\.\(Heet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib49)\)motivate us to explore this research direction further\. They showed that the error\-minimizing attack\(Huanget al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib12)\)and the error\-maximizing attack\(Fowlet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib10)\)could not protect data against unsupervised contrastive learning\(Chopraet al\.,[2005](https://arxiv.org/html/2605.12792#bib.bib50)\)models, which are among the most powerful unsupervised learning methods\. Moreover, Wang et al\.\(Wanget al\.,[2024b](https://arxiv.org/html/2605.12792#bib.bib93)\)showed that NTGA was ineffective in degrading the performance of the SimCLR model\(Chenet al\.,[2020](https://arxiv.org/html/2605.12792#bib.bib102)\), a contrastive learning framework\. It is worth noting that few studies in the literature address data protection methods against unsupervised learning methods\(Heet al\.,[2022](https://arxiv.org/html/2605.12792#bib.bib49)\)\.
### 4\.5\.NTGA on Ensemble Learning Algorithms
Another aspect we can explore is the behavior of NTGA under ensemble learning\. Ensemble learning \(Dong et al\.,[2020](https://arxiv.org/html/2605.12792#bib.bib76)\) methods utilize multiple learning algorithms with weak predictive results and combine their outputs using voting mechanisms to achieve better performance than any single constituent algorithm achieves alone\. Traditional ensemble learning mechanisms include stacking, boosting, and bagging \(Dong et al\.,[2020](https://arxiv.org/html/2605.12792#bib.bib76)\)\. In recent years, ensemble learning mechanisms have dominated machine learning applications such as medical data analysis and fraud detection \(Wen et al\.,[2017](https://arxiv.org/html/2605.12792#bib.bib77)\)\(Sagi and Rokach,[2018](https://arxiv.org/html/2605.12792#bib.bib78)\)\(Xiao et al\.,[2018](https://arxiv.org/html/2605.12792#bib.bib79)\)\. Because of the impressive performance of ensemble learning methods, it is essential to explore whether NTGA can protect data against such advanced machine learning methods\. Hapuarachchi et al\. \(Hapuarachchi and Xiong,[2025](https://arxiv.org/html/2605.12792#bib.bib86)\) conducted a comprehensive analysis of leveraging ensemble learning with data generated through generalization attacks, including NTGA\. They demonstrated that, with slight modifications, ensemble learning can be successfully applied to data protected by generalization attacks, revealing the vulnerabilities of these attacks\. We suggest investigating ways to improve NTGA’s robustness against such advanced ensemble learning algorithms\. For instance, instead of relying on a single surrogate model when generating perturbations, NTGA could incorporate ensemble machine learning algorithms\.
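The voting mechanism described above can be sketched in a few lines\. This is a minimal illustration of hard majority voting only, assuming hypothetical threshold classifiers \(`stump`\) as the weak learners; it does not reproduce the modified ensemble methods of the cited analysis, and all names are chosen for this example\.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A classifier maps a feature vector to an integer class label.
Classifier = Callable[[Sequence[float]], int]


def majority_vote(classifiers: List[Classifier], x: Sequence[float]) -> int:
    """Return the label predicted by the largest number of classifiers."""
    votes = [clf(x) for clf in classifiers]
    # most_common(1) yields the (label, count) pair with the highest count.
    return Counter(votes).most_common(1)[0][0]


def stump(feature: int, threshold: float) -> Classifier:
    """Hypothetical weak learner: thresholds a single feature."""
    return lambda x: 1 if x[feature] > threshold else 0


# Three weak learners, each looking at a different feature.
ensemble = [stump(0, 0.5), stump(1, 0.5), stump(2, 0.5)]
sample = [0.9, 0.2, 0.7]
prediction = majority_vote(ensemble, sample)
```

Because the final label depends on agreement among several models, a perturbation that misleads one member does not necessarily change the ensemble output, which is one intuition for why ensembles are a natural stress test for generalization attacks\.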
### 4\.6\.NTGA for Other Data Types
NTGA has so far been introduced primarily for image data\. However, data protection approaches are now being extended to other domains such as text and audio \(Liu et al\.,[2024a](https://arxiv.org/html/2605.12792#bib.bib89); Li et al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib96); Wang et al\.,[2024a](https://arxiv.org/html/2605.12792#bib.bib100)\)\. Given the growing prominence and popularity of Natural Language Processing \(NLP\) and text\-based applications, exploring how these methods can be adapted to protect textual data is extremely important\. Some studies have already begun investigating this direction \(Liu et al\.,[2024a](https://arxiv.org/html/2605.12792#bib.bib89); Li et al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib96)\)\. For instance, Liu et al\. \(Liu et al\.,[2024a](https://arxiv.org/html/2605.12792#bib.bib89)\) applied error\-minimizing attacks to both images and their associated captions, aiming to protect multimodal content\. The potential novelty of applying NTGA in this context lies in its black\-box attack capabilities\. Similarly, Li et al\. \(Li et al\.,[2023](https://arxiv.org/html/2605.12792#bib.bib96)\) adopted error\-minimizing attacks for text data\. Additionally, Gokul et al\. \(Gokul and Dubnov,[2024](https://arxiv.org/html/2605.12792#bib.bib97)\) explored unlearnable examples in the audio domain using the CUDA \(Sadasivan et al\.,[2023a](https://arxiv.org/html/2605.12792#bib.bib70)\) framework\.
## 5\.Conclusion
NTGA is undoubtedly a vital discovery in the area of clean\-label generalization attacks\. Seven years after the attack was proposed, this paper analyzes the current status of NTGA among other clean\-label generalization attacks\. We have noticed that adversarial training and simple image transformations mitigate the protection given by NTGA, as is typical for other generalization attacks as well\. Moreover, we have observed that major clean\-label generalization attacks outperform NTGA in several cases\. However, given that it is the only prevailing black\-box clean\-label generalization attack, it is worth studying for further improvement\. Hence, we provide several insights for future research on NTGA based on the areas that have yet to be explored\.
## References
- I\. M\. Ahmed and M\. Y\. Kashmoola \(2021\)Threats on machine learning technique by data poisoning attack: a survey\.InInternational Conference on Advances in Cyber Security,pp\. 586–600\.Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.4)\.
- A\. N\. Bhagoji, W\. He, B\. Li, and D\. Song \(2017\)Exploring the space of black\-box attacks on deep neural networks\.CoRRabs/1712\.09491\.External Links:[Link](http://arxiv.org/abs/1712.09491),1712\.09491Cited by:[§2\.5](https://arxiv.org/html/2605.12792#S2.SS5.p1.1)\.
- B\. Biggio, B\. Nelson, and P\. Laskov \(2012\)Poisoning attacks against support vector machines\.InProceedings of the 29th International Conference on Machine Learning \(ICML\),External Links:[Link](http://icml.cc/2012/papers/880.pdf)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1)\.
- E\. Borgnia, V\. Cherepanova, L\. Fowl, A\. Ghiasi, J\. Geiping, M\. Goldblum, T\. Goldstein, and A\. Gupta \(2021\)Strong data augmentation sanitizes poisoning and backdoor attacks without an accuracy tradeoff\.InIEEE International Conference on Acoustics, Speech and Signal Processing \(ICASSP\), Toronto, ON, Canada,pp\. 3855–3859\.External Links:[Link](https://doi.org/10.1109/ICASSP39728.2021.9414862),[Document](https://dx.doi.org/10.1109/ICASSP39728.2021.9414862)Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1)\.
- Q\. Cao, L\. Shen, W\. Xie, O\. M\. Parkhi, and A\. Zisserman \(2018\)VGGFace2: A dataset for recognising faces across pose and age\.In13th IEEE International Conference on Automatic Face & Gesture Recognition, Xi’an, China,pp\. 67–74\.External Links:[Link](https://doi.org/10.1109/FG.2018.00020),[Document](https://dx.doi.org/10.1109/FG.2018.00020)Cited by:[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p1.1)\.
- N\. Carlini and D\. A\. Wagner \(2017\)Towards evaluating the robustness of neural networks\.In2017 IEEE Symposium on Security and Privacy \(SP\),pp\. 39–57\.External Links:[Link](https://doi.org/10.1109/SP.2017.49),[Document](https://dx.doi.org/10.1109/SP.2017.49)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p3.1),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p3.1)\.
- A\. Chan\-Hon\-Tong \(2019\)An algorithm for generating invisible data poisoning using adversarial noise that breaks image classification deep learning\.Mach\. Learn\. Knowl\. Extr\.1\(1\),pp\. 192–204\.External Links:[Link](https://doi.org/10.3390/make1010011),[Document](https://dx.doi.org/10.3390/make1010011)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.6),[Definition 2\.1](https://arxiv.org/html/2605.12792#S2.Thmtheorem1.p1.1),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p1.1)\.
- S\. Chen, G\. Yuan, X\. Cheng, Y\. Gong, M\. Qin, Y\. Wang, and X\. Huang \(2023\)Self\-ensemble protection: training checkpoints are good data protectors\.InThe Eleventh International Conference on Learning Representations, ICLR, Kigali, Rwanda,Cited by:[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p3.1.1)\.
- T\. Chen, S\. Kornblith, M\. Norouzi, and G\. Hinton \(2020\)A simple framework for contrastive learning of visual representations\.InInternational conference on machine learning,pp\. 1597–1607\.Cited by:[§4\.4](https://arxiv.org/html/2605.12792#S4.SS4.p1.1.1)\.
- X\. Chen, Y\. Xu, S\. Zhang, J\. Yan, W\. Xu, and X\. He \(2025\)EUN: enhanced unlearnable examples generation approach for privacy protection\.Comput\. Vis\. Image Underst\.258,pp\. 104388\.External Links:[Link](https://doi.org/10.1016/j.cviu.2025.104388)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.3.2.3.1.1),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p2.1.1)\.
- X\. Chen, C\. Liu, B\. Li, K\. Lu, and D\. Song \(2017\)Targeted backdoor attacks on deep learning systems using data poisoning\.CoRRabs/1712\.05526\.External Links:[Link](http://arxiv.org/abs/1712.05526),1712\.05526Cited by:[§2\.5](https://arxiv.org/html/2605.12792#S2.SS5.p1.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1)\.
- S\. Chopra, R\. Hadsell, and Y\. LeCun \(2005\)Learning a similarity metric discriminatively, with application to face verification\.In2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition \(CVPR\),pp\. 539–546\.External Links:[Link](https://doi.org/10.1109/CVPR.2005.202),[Document](https://dx.doi.org/10.1109/CVPR.2005.202)Cited by:[§4\.4](https://arxiv.org/html/2605.12792#S4.SS4.p1.1)\.
- D\. Chudasama, T\. Patel, S\. Joshi, and G\. Prajapati \(2015\)Image segmentation using morphological operations\.International Journal of Computer Applications117,pp\. 16–19\.External Links:[Document](https://dx.doi.org/10.5120/20654-3197)Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p2.1)\.
- D\. C\. Ciresan, U\. Meier, J\. Masci, L\. M\. Gambardella, and J\. Schmidhuber \(2011\)High\-performance neural networks for visual object classification\.CoRRabs/1102\.0183\.External Links:[Link](http://arxiv.org/abs/1102.0183),1102\.0183Cited by:[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p1.1)\.
- N\. Das, M\. Shanbhogue, S\. Chen, F\. Hohman, S\. Li, L\. Chen, M\. E\. Kounavis, and D\. H\. Chau \(2018\)SHIELD: fast, practical defense and vaccination for deep learning using JPEG compression\.InProceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining \(KDD\), London, UK,Y\. Guo and F\. Farooq \(Eds\.\),pp\. 196–204\.External Links:[Link](https://doi.org/10.1145/3219819.3219910),[Document](https://dx.doi.org/10.1145/3219819.3219910)Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1)\.
- J\. Deng, W\. Dong, R\. Socher, L\. Li, K\. Li, and L\. Fei\-Fei \(2009\)ImageNet: A large\-scale hierarchical image database\.InIEEE Computer Society Conference on Computer Vision and Pattern Recognition \(CVPR\),pp\. 248–255\.External Links:[Link](https://doi.org/10.1109/CVPR.2009.5206848),[Document](https://dx.doi.org/10.1109/CVPR.2009.5206848)Cited by:[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p1.1)\.
- T\. Devries and G\. W\. Taylor \(2017\)Improved regularization of convolutional neural networks with cutout\.CoRRabs/1708\.04552\.External Links:[Link](http://arxiv.org/abs/1708.04552),1708\.04552Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1)\.
- H\. M\. Dolatabadi, S\. M\. Erfani, and C\. Leckie \(2023\)The devil’s advocate: shattering the illusion of unexploitable data using diffusion models\.CoRRabs/2303\.08500\.External Links:[Link](https://doi.org/10.48550/arXiv.2303.08500),[Document](https://dx.doi.org/10.48550/arXiv.2303.08500),2303\.08500Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.5.4.3.1.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p3.1)\.
- X\. Dong, Z\. Yu, W\. Cao, Y\. Shi, and Q\. Ma \(2020\)A survey on ensemble learning\.Frontiers Comput\. Sci\.14\(2\),pp\. 241–258\.External Links:[Link](https://doi.org/10.1007/s11704-019-8208-z),[Document](https://dx.doi.org/10.1007/s11704-019-8208-z)Cited by:[§4\.5](https://arxiv.org/html/2605.12792#S4.SS5.p1.1)\.
- J\. Fan, Q\. Yan, M\. Li, G\. Qu, and Y\. Xiao \(2022\)A survey on data poisoning attacks and defenses\.In7th IEEE International Conference on Data Science in Cyberspace, DSC 2022, Guilin, China, July 11\-13, 2022,pp\. 48–55\.External Links:[Link](https://doi.org/10.1109/DSC55868.2022.00014),[Document](https://dx.doi.org/10.1109/DSC55868.2022.00014)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.4)\.
- B\. Fang, B\. Li, S\. Wu, S\. Ding, R\. Yi, and L\. Ma \(2024\)Re\-thinking data availability attacks against deep neural networks\.InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition,pp\. 12215–12224\.Cited by:[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p2.1)\.
- J\. Feng, Q\. Cai, and Z\. Zhou \(2019\)Learning to confuse: generating training time adversarial data with auto\-encoder\.InAnnual Conference on Neural Information Processing Systems, \(NeurIPS\),pp\. 11971–11981\.External Links:[Link](https://proceedings.neurips.cc/paper/2019/hash/1ce83e5d4135b07c0b82afffbe2b3436-Abstract.html)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.6),[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1),[§2\.3\.3](https://arxiv.org/html/2605.12792#S2.SS3.SSS3.p1.1),[§2\.3\.3](https://arxiv.org/html/2605.12792#S2.SS3.SSS3.p2.2),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p1.1),[Table 2](https://arxiv.org/html/2605.12792#S3.T2.1.3.2.1),[Table 3](https://arxiv.org/html/2605.12792#S3.T3.1.3.2.1),[Table 4](https://arxiv.org/html/2605.12792#S3.T4.1.4.4.1)\.
- L\. Fowl, M\. Goldblum, P\. Chiang, J\. Geiping, W\. Czaja, and T\. Goldstein \(2021\)Adversarial examples make strong poisons\.InAnnual Conference on Neural Information Processing Systems \(NeurIPS\),pp\. 30339–30351\.External Links:[Link](https://proceedings.neurips.cc/paper/2021/hash/fe87435d12ef7642af67d9bc82a8b3cd-Abstract.html)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1),[§2\.3\.1](https://arxiv.org/html/2605.12792#S2.SS3.SSS1.p1.1),[§2\.3\.1](https://arxiv.org/html/2605.12792#S2.SS3.SSS1.p1.2),[Table 2](https://arxiv.org/html/2605.12792#S3.T2.1.5.4.1),[Table 3](https://arxiv.org/html/2605.12792#S3.T3.1.5.4.1),[Table 4](https://arxiv.org/html/2605.12792#S3.T4.1.6.6.1),[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p3.1),[§4\.4](https://arxiv.org/html/2605.12792#S4.SS4.p1.1)\.
- S\. Fu, F\. He, Y\. Liu, L\. Shen, and D\. Tao \(2022\)Robust unlearnable examples: protecting data privacy against adversarial learning\.InThe Tenth International Conference on Learning Representations \(ICLR\),External Links:[Link](https://openreview.net/forum?id=baUQQPwQiAg)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.3.2.3.1.1),[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.4.3.3.1.1),[§2\.4](https://arxiv.org/html/2605.12792#S2.SS4.p1.2),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p2.1),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p3.1.1),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p1.1),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p2.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p2.1),[§3\.3](https://arxiv.org/html/2605.12792#S3.SS3.p1.1),[Table 2](https://arxiv.org/html/2605.12792#S3.T2.1.7.6.1),[Table 3](https://arxiv.org/html/2605.12792#S3.T3.1.7.6.1),[Table 4](https://arxiv.org/html/2605.12792#S3.T4.1.8.8.1)\.
- C\. Fung, C\. J\. M\. Yoon, and I\. Beschastnikh \(2018\)Mitigating sybils in federated learning poisoning\.CoRRabs/1808\.04866\.External Links:[Link](http://arxiv.org/abs/1808.04866),1808\.04866Cited by:[§4\.3](https://arxiv.org/html/2605.12792#S4.SS3.p1.1)\.
- A\. Galakatos, A\. Crotty, and T\. Kraska \(2018\)Distributed machine learning\.Cited by:[§4\.3](https://arxiv.org/html/2605.12792#S4.SS3.p1.1)\.
- E\. S\. Gedraite and M\. Hadad \(2011\)Investigation on the effect of a gaussian blur in image filtering and segmentation\.InProceedings ELMAR,pp\. 393–396\.Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p2.1)\.
- J\. Geiping, L\. H\. Fowl, G\. Somepalli, M\. Goldblum, M\. Moeller, and T\. Goldstein \(2022\)What doesn’t kill you makes you robust\(er\): how to adversarially train against data poisoning\.External Links:[Link](https://openreview.net/forum?id=VMuenFh7IpP)Cited by:[§2\.4](https://arxiv.org/html/2605.12792#S2.SS4.p1.2)\.
- V\. Gokul and S\. Dubnov \(2024\)Poscuda: position based convolution for unlearnable audio datasets\.arXiv preprint arXiv:2401\.02135\.Cited by:[§4\.6](https://arxiv.org/html/2605.12792#S4.SS6.p1.1.1)\.
- X\. Gong, Y\. Wang, Y\. Chen, H\. Dong, Y\. Li, M\. Sun, S\. Li, Q\. Wang, and C\. Chen \(2025\)ARMOR: shielding unlearnable examples against data augmentation\.CoRRabs/2501\.08862\.External Links:[Link](https://doi.org/10.48550/arXiv.2501.08862)Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p4.1)\.
- I\. Goodfellow, Y\. Bengio, and A\. Courville \(2016\)Deep learning\.MIT Press\.Note:[http://www\.deeplearningbook\.org](http://www.deeplearningbook.org/)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.2)\.
- I\. J\. Goodfellow, J\. Shlens, and C\. Szegedy \(2015\)Explaining and harnessing adversarial examples\.In3rd International Conference on Learning Representations \(ICLR\), San Diego, CA, USA,Y\. Bengio and Y\. LeCun \(Eds\.\),External Links:[Link](http://arxiv.org/abs/1412.6572)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p3.1),[§2\.3\.1](https://arxiv.org/html/2605.12792#S2.SS3.SSS1.p1.2)\.
- \[33\]Google cloud automl\(Website\)External Links:[Link](https://cloud.google.com/automl)Cited by:[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p2.1)\.
- \[34\]Google cloud vision api\(Website\)External Links:[Link](https://cloud.google.com/vision)Cited by:[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p2.1)\.
- T\. Hapuarachchi, J\. Lin, K\. Xiong, M\. Rahouti, and G\. Ost \(2024\)Nonlinear transformations against unlearnable datasets\.CoRRabs/2406\.02883\.External Links:[Link](https://doi.org/10.48550/arXiv.2406.02883)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.5.4.3.1.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p2.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p3.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p5.1)\.
- T\. Hapuarachchi and K\. Xiong \(2025\)Advancing ensemble learning against unlearnable data\.Neurocomputing,pp\. 130422\.Cited by:[§4\.5](https://arxiv.org/html/2605.12792#S4.SS5.p1.1)\.
- H\. He, K\. Zha, and D\. Katabi \(2022\)Indiscriminate poisoning attacks on unsupervised contrastive learning\.CoRRabs/2202\.11202\.External Links:[Link](https://arxiv.org/abs/2202.11202),2202\.11202Cited by:[§4\.4](https://arxiv.org/html/2605.12792#S4.SS4.p1.1)\.
- H\. Huang, X\. Ma, S\. M\. Erfani, J\. Bailey, and Y\. Wang \(2021\)Unlearnable examples: making personal data unexploitable\.In9th International Conference on Learning Representations \(ICLR\),External Links:[Link](https://openreview.net/forum?id=iAmZUo0DxC0)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1),[§2\.3\.2](https://arxiv.org/html/2605.12792#S2.SS3.SSS2.p1.3),[§2\.3\.2](https://arxiv.org/html/2605.12792#S2.SS3.SSS2.p1.4),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1),[Table 2](https://arxiv.org/html/2605.12792#S3.T2.1.4.3.1),[Table 3](https://arxiv.org/html/2605.12792#S3.T3.1.4.3.1),[Table 4](https://arxiv.org/html/2605.12792#S3.T4.1.5.5.1),[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p3.1),[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p1.1),[§4\.4](https://arxiv.org/html/2605.12792#S4.SS4.p1.1)\.
- S\. H\. Huang, N\. Papernot, I\. J\. Goodfellow, Y\. Duan, and P\. Abbeel \(2017\)Adversarial attacks on neural network policies\.CoRRabs/1702\.02284\.External Links:[Link](http://arxiv.org/abs/1702.02284),1702\.02284Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p3.1),[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p4.9),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p3.1)\.
- W\. R\. Huang, J\. Geiping, L\. Fowl, G\. Taylor, and T\. Goldstein \(2020\)MetaPoison: practical general\-purpose clean\-label data poisoning\.InAnnual Conference on Neural Information Processing Systems \(NeurIPS\),External Links:[Link](https://proceedings.neurips.cc/paper/2020/hash/8ce6fc704072e351679ac97d4a985574-Abstract.html)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1),[§2\.4](https://arxiv.org/html/2605.12792#S2.SS4.p1.1),[§2\.5](https://arxiv.org/html/2605.12792#S2.SS5.p1.1),[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p2.1)\.
- Z\. Huang and T\. Zhang \(2019\)Black\-box adversarial attack with transferable model\-based embedding\.CoRRabs/1911\.07140\.External Links:[Link](http://arxiv.org/abs/1911.07140),1911\.07140Cited by:[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p2.1)\.
- A\. Jacot, C\. Hongler, and F\. Gabriel \(2018\)Neural tangent kernel: convergence and generalization in neural networks\.InAnnual Conference on Neural Information Processing Systems \(NeurIPS\),pp\. 8580–8589\.External Links:[Link](https://proceedings.neurips.cc/paper/2018/hash/5a4be1fa34e62bb8a6ec6b91d2462f5a-Abstract.html)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.3),[§1](https://arxiv.org/html/2605.12792#S1.tab1.6),[§2\.2](https://arxiv.org/html/2605.12792#S2.SS2.p1.1),[§2\.2](https://arxiv.org/html/2605.12792#S2.SS2.p2.1),[Definition 2\.2](https://arxiv.org/html/2605.12792#S2.Thmtheorem2.p1.2)\.
- A\. Krizhevsky, G\. Hinton,et al\.\(2009\)Learning multiple layers of features from tiny images\.Cited by:[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p1.1)\.
- A\. Kurakin, I\. J\. Goodfellow, and S\. Bengio \(2017\)Adversarial examples in the physical world\.In5th International Conference on Learning Representations \(ICLR\),External Links:[Link](https://openreview.net/forum?id=HJGU3Rodl)Cited by:[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p2.1)\.
- J\. Lee, Y\. Bahri, R\. Novak, S\. S\. Schoenholz, J\. Pennington, and J\. Sohl\-Dickstein \(2017\)Deep neural networks as gaussian processes\.CoRRabs/1711\.00165\.External Links:[Link](http://arxiv.org/abs/1711.00165),1711\.00165Cited by:[§2\.2](https://arxiv.org/html/2605.12792#S2.SS2.p1.1)\.
- J\. Li, Y\. Chen, Y\. Xing, Y\. Gu, and X\. Lan \(2025\)A survey on unlearnable data\.CoRRabs/2503\.23536\.External Links:[Link](https://doi.org/10.48550/arXiv.2503.23536)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.4)\.
- X\. Li, M\. Liu, and S\. Gao \(2023\)Make text unlearnable: exploiting effective patterns to protect personal data\.arXiv preprint arXiv:2307\.00456\.Cited by:[§4\.6](https://arxiv.org/html/2605.12792#S4.SS6.p1.1.1)\.
- Y\. Li, J\. Lin, and K\. Xiong \(2021\)An adversarial attack defending system for securing in\-vehicle networks\.In18th IEEE Annual Consumer Communications & Networking Conference \(CCNC\), Las Vegas, NV, USA,pp\. 1–6\.External Links:[Link](https://doi.org/10.1109/CCNC49032.2021.9369569),[Document](https://dx.doi.org/10.1109/CCNC49032.2021.9369569)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.2)\.
- J\. Lin, L\. Dang, M\. Rahouti, and K\. Xiong \(2021\)ML attack models: adversarial attacks and data poisoning attacks\.arXiv preprint arXiv:2112\.02797\.Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p4.9)\.
- H\. Liu, D\. Li, and Y\. Li \(2021a\)Poisonous label attack: black\-box data poisoning attack with enhanced conditional DCGAN\.Neural Process\. Lett\.53\(6\),pp\. 4117–4142\.External Links:[Link](https://doi.org/10.1007/s11063-021-10584-w),[Document](https://dx.doi.org/10.1007/s11063-021-10584-w)Cited by:[§2\.5](https://arxiv.org/html/2605.12792#S2.SS5.p1.1)\.
- X\. Liu, X\. Jia, Y\. Xun, S\. Liang, and X\. Cao \(2024a\)Multimodal unlearnable examples: protecting data against multimodal contrastive learning\.In32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia,pp\. 8024–8033\.External Links:[Link](https://doi.org/10.1145/3664647.3680708)Cited by:[§4\.6](https://arxiv.org/html/2605.12792#S4.SS6.p1.1.1)\.
- Y\. Liu, X\. Chen, C\. Liu, and D\. Song \(2016\)Delving into transferable adversarial examples and black\-box attacks\.CoRRabs/1611\.02770\.External Links:[Link](http://arxiv.org/abs/1611.02770),1611\.02770Cited by:[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p1.1)\.
- Y\. Liu, K\. Xu, X\. Chen, and L\. Sun \(2024b\)Stable unlearnable example: enhancing the robustness of unlearnable examples via stable error\-minimizing noise\.InThirty\-Sixth Conference on Innovative Applications of Artificial Intelligence, Vancouver, Canada,pp\. 3783–3791\.External Links:[Link](https://doi.org/10.1609/aaai.v38i4.28169)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.3.2.3.1.1),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p2.1.1)\.
- Z\. Liu, Z\. Zhao, A\. Kolmus, T\. Berns, T\. van Laarhoven, T\. Heskes, and M\. A\. Larson \(2021b\)Going grayscale: the road to understanding and improving unlearnable examples\.CoRRabs/2111\.13244\.External Links:[Link](https://arxiv.org/abs/2111.13244),2111\.13244Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.5.4.3.1.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p5.1)\.
- Z\. Liu, Z\. Zhao, and M\. A\. Larson \(2023\)Image shortcut squeezing: countering perturbative availability poisons with compression\.CoRRabs/2301\.13838\.External Links:[Link](https://doi.org/10.48550/arXiv.2301.13838),[Document](https://dx.doi.org/10.48550/arXiv.2301.13838),2301\.13838Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.5.4.3.1.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p2.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p4.1),[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p5.1)\.
- T\. Long, Q\. Gao, L\. Xu, and Z\. Zhou \(2022\)A survey on adversarial attacks in computer vision: taxonomy, visualization and future directions\.Comput\. Secur\.121,pp\. 102847\.External Links:[Link](https://doi.org/10.1016/j.cose.2022.102847),[Document](https://dx.doi.org/10.1016/j.cose.2022.102847)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.4)\.
- Y\. Ma, T\. Xie, J\. Li, and R\. Maciejewski \(2020\)Explaining vulnerabilities to adversarial machine learning through visual analytics\.IEEE Trans\. Vis\. Comput\. Graph\.26\(1\),pp\. 1075–1085\.External Links:[Link](https://doi.org/10.1109/TVCG.2019.2934631),[Document](https://dx.doi.org/10.1109/TVCG.2019.2934631)Cited by:[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p2.1)\.
- G\. R\. Machado, E\. Silva, and R\. R\. Goldschmidt \(2020\)Adversarial machine learning in image classification: A survey towards the defender’s perspective\.CoRRabs/2009\.03728\.External Links:[Link](https://arxiv.org/abs/2009.03728),2009\.03728Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1),[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p3.1)\.
- \[59\]Machine learning on aws\(Website\)External Links:[Link](https://aws.amazon.com/machine-learning/)Cited by:[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p2.1)\.
- A\. Madry, A\. Makelov, L\. Schmidt, D\. Tsipras, and A\. Vladu \(2017\)Towards deep learning models resistant to adversarial attacks\.CoRRabs/1706\.06083\.External Links:[Link](http://arxiv.org/abs/1706.06083),1706\.06083Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p3.1),[§2\.4](https://arxiv.org/html/2605.12792#S2.SS4.p1.1),[§2\.4](https://arxiv.org/html/2605.12792#S2.SS4.p1.2),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p1.1),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p3.1),[§3\.4](https://arxiv.org/html/2605.12792#S3.SS4.p1.10)\.
- M\. H\. Meng, G\. Bai, S\. G\. Teo, Z\. Hou, Y\. Xiao, Y\. Lin, and J\. S\. Dong \(2022\)Adversarial robustness of deep neural networks: A survey from a formal verification perspective\.CoRRabs/2206\.12227\.External Links:[Link](https://doi.org/10.48550/arXiv.2206.12227),[Document](https://dx.doi.org/10.48550/arXiv.2206.12227),2206\.12227Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.4)\.
- S\. Moosavi\-Dezfooli, A\. Fawzi, and P\. Frossard \(2016\)DeepFool: A simple and accurate method to fool deep neural networks\.InIEEE Conference on Computer Vision and Pattern Recognition \(CVPR\), Las Vegas, NV, USA,pp\. 2574–2582\.External Links:[Link](https://doi.org/10.1109/CVPR.2016.282),[Document](https://dx.doi.org/10.1109/CVPR.2016.282)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p3.1)\.
- N\. Papernot, P\. D\. McDaniel, I\. J\. Goodfellow, S\. Jha, Z\. B\. Celik, and A\. Swami \(2017\)Practical black\-box attacks against machine learning\.InProceedings of the Asia Conference on Computer and Communications Security \(AsiaCCS\),pp\. 506–519\.External Links:[Link](https://doi.org/10.1145/3052973.3053009),[Document](https://dx.doi.org/10.1145/3052973.3053009)Cited by:[§2\.5](https://arxiv.org/html/2605.12792#S2.SS5.p1.1),[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p2.1)\.
- N\. Papernot, P\. D\. McDaniel, and I\. J\. Goodfellow \(2016\)Transferability in machine learning: from phenomena to black\-box attacks using adversarial samples\.CoRRabs/1605\.07277\.External Links:[Link](http://arxiv.org/abs/1605.07277),1605\.07277Cited by:[§4\.1](https://arxiv.org/html/2605.12792#S4.SS1.p1.1)\.
- A\. S\. Poznyak, I\. Chairez, and T\. Poznyak \(2019\)A survey on artificial neural networks application for identification and control in environmental engineering: biological and chemical systems with uncertain models\.Annu\. Rev\. Control\.48,pp\. 250–272\.External Links:[Link](https://doi.org/10.1016/j.arcontrol.2019.07.003),[Document](https://dx.doi.org/10.1016/j.arcontrol.2019.07.003)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.2)\.
- T\. Qin, X\. Gao, J\. Zhao, K\. Ye, and C\. Xu \(2023\)Apbench: a unified benchmark for availability poisoning attacks and defenses\.arXiv preprint arXiv:2308\.03258\.Cited by:[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p2.1.1)\.
- H\. Qiu, Y\. Zeng, T\. Zhang, Y\. Jiang, and M\. Qiu \(2020\)FenceBox: A platform for defeating adversarial examples with data augmentation techniques\.CoRRabs/2012\.01701\.External Links:[Link](https://arxiv.org/abs/2012.01701),2012\.01701Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1)\.
- V\. S\. Sadasivan, M\. Soltanolkotabi, and S\. Feizi \(2023a\)CUDA: convolution\-based unlearnable datasets\.CoRRabs/2303\.04278\.External Links:[Link](https://doi.org/10.48550/arXiv.2303.04278),[Document](https://dx.doi.org/10.48550/arXiv.2303.04278),2303\.04278Cited by:[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p4.1),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p2.1),[§4\.6](https://arxiv.org/html/2605.12792#S4.SS6.p1.1.1)\.
- V\. S\. Sadasivan, M\. Soltanolkotabi, and S\. Feizi \(2023b\)FUN: filter\-based unlearnable datasets\.External Links:[Link](https://openreview.net/forum?id=iaCzfh6vtwQ)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.4.3.3.1.1),[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.6.5.3.1.1),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p2.1),[§3\.3](https://arxiv.org/html/2605.12792#S3.SS3.p1.1)\.
- O\. Sagi and L\. Rokach \(2018\)Ensemble learning: A survey\.WIREs Data Mining Knowl\. Discov\.8\(4\)\.External Links:[Link](https://doi.org/10.1002/widm.1249),[Document](https://dx.doi.org/10.1002/widm.1249)Cited by:[§4\.5](https://arxiv.org/html/2605.12792#S4.SS5.p1.1)\.
- P\. S\. Segura, V\. Singla, J\. Geiping, M\. Goldblum, T\. Goldstein, and D\. W\. Jacobs \(2022\)Autoregressive perturbations for data poisoning\.CoRRabs/2206\.03693\.External Links:[Link](https://doi.org/10.48550/arXiv.2206.03693),[Document](https://dx.doi.org/10.48550/arXiv.2206.03693),2206\.03693Cited by:[Definition 2\.1](https://arxiv.org/html/2605.12792#S2.Thmtheorem1.p1.1),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p3.1.1),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p4.1),[§3\.3](https://arxiv.org/html/2605.12792#S3.SS3.p2.1.1),[Table 2](https://arxiv.org/html/2605.12792#S3.T2.1.9.8.1),[Table 3](https://arxiv.org/html/2605.12792#S3.T3.1.9.8.1),[Table 4](https://arxiv.org/html/2605.12792#S3.T4.1.10.10.1)\.
- P\. S\. Segura, V\. Singla, J\. Geiping, M\. Goldblum, and T\. Goldstein \(2023\)What can we learn from unlearnable datasets?\.InNeural Information Processing Systems, NeurIPS, New Orleans, LA, USA,External Links:[Link](http://papers.nips.cc/paper%5C_files/paper/2023/hash/ee5bb72130c332c3d4bf8d231e617506-Abstract-Conference.html)Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p2.1.1),[§3\.3](https://arxiv.org/html/2605.12792#S3.SS3.p2.1.1),[§3\.3](https://arxiv.org/html/2605.12792#S3.SS3.p3.1.1)\.
- A\. Shafahi, W\. R\. Huang, M\. Najibi, O\. Suciu, C\. Studer, T\. Dumitras, and T\. Goldstein \(2018\)Poison frogs\! targeted clean\-label poisoning attacks on neural networks\.InAnnual Conference on Neural Information Processing Systems \(NeurIPS\),pp\. 6106–6116\.External Links:[Link](https://proceedings.neurips.cc/paper/2018/hash/22722a343513ed45f14905eb07621686-Abstract.html)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1),[§2\.5](https://arxiv.org/html/2605.12792#S2.SS5.p1.1)\.
- S\. Shan, E\. Wenger, J\. Zhang, H\. Li, H\. Zheng, and B\. Y\. Zhao \(2020\)Fawkes: protecting privacy against unauthorized deep learning models\.In29th USENIX Security Symposium \(USENIX\),S\. Capkun and F\. Roesner \(Eds\.\),pp\. 1589–1604\.External Links:[Link](https://www.usenix.org/conference/usenixsecurity20/presentation/shan)Cited by:[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p1.1)\.
- M\. Sharif, S\. Bhagavatula, L\. Bauer, and M\. K\. Reiter \(2016\)Accessorize to a crime: real and stealthy attacks on state\-of\-the\-art face recognition\.InProceedings of the ACM SIGSAC Conference on Computer and Communications Security,pp\. 1528–1540\.External Links:[Link](https://doi.org/10.1145/2976749.2978392),[Document](https://dx.doi.org/10.1145/2976749.2978392)Cited by:[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p2.1)\.
- S\. Sharma, J\. J\. Zou, G\. Fang, P\. Shukla, and W\. Cai \(2024\)A review of image watermarking for identity protection and verification\.Multimedia Tools and Applications83\(11\),pp\. 31829–31891\.Cited by:[§2\.5](https://arxiv.org/html/2605.12792#S2.SS5.p2.1)\.
- R\. Shokri, M\. Stronati, and V\. Shmatikov \(2016\)Membership inference attacks against machine learning models\.CoRRabs/1610\.05820\.External Links:[Link](http://arxiv.org/abs/1610.05820)Cited by:[§2\.5](https://arxiv.org/html/2605.12792#S2.SS5.p2.1)\.
- C\. Szegedy, W\. Zaremba, I\. Sutskever, J\. Bruna, D\. Erhan, I\. J\. Goodfellow, and R\. Fergus \(2014\)Intriguing properties of neural networks\.In2nd International Conference on Learning Representations \(ICLR\),External Links:[Link](http://arxiv.org/abs/1312.6199)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p3.1)\.
- L\. Tao, L\. Feng, H\. Wei, J\. Yi, S\. Huang, and S\. Chen \(2022\)Can adversarial training be manipulated by non\-robust features?\.CoRR\.External Links:[Link](https://arxiv.org/abs/2201.13329),2201\.13329Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.4.3.3.1.1),[§2\.4](https://arxiv.org/html/2605.12792#S2.SS4.p1.1),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p2.1),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p3.1)\.
- Y\. Tian, W\. Zhang, A\. Simpson, Y\. Liu, and Z\. L\. Jiang \(2021\)Defending against data poisoning attacks: from distributed learning to federated learning\.The Computer Journal\.Cited by:[§4\.3](https://arxiv.org/html/2605.12792#S4.SS3.p1.1)\.
- Z\. Tian, L\. Cui, J\. Liang, and S\. Yu \(2022\)A comprehensive survey on poisoning attacks and countermeasures in machine learning\.ACM Comput\. Surv\.55\(8\)\.External Links:ISSN 0360\-0300,[Link](https://doi.org/10.1145/3551636),[Document](https://dx.doi.org/10.1145/3551636)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.4)\.
- R\. Tomsett, K\. Chan, and S\. Chakraborty \(2019\)Model poisoning attacks against distributed machine learning systems\.InArtificial Intelligence and Machine Learning for Multi\-Domain Operations Applications,Vol\.11006,pp\. 481–489\.Cited by:[§4\.3](https://arxiv.org/html/2605.12792#S4.SS3.p1.1)\.
- N\. Tsilivis and J\. Kempe \(2022\)The NTK adversary: an approach to adversarial attacks without any model access\.External Links:[Link](https://openreview.net/forum?id=M5hiCgL7qt)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.7.6.3.1.1),[§3\.4](https://arxiv.org/html/2605.12792#S3.SS4.p1.10)\.
- \[84\]Use a color matrix to transform a single color\(Website\)External Links:[Link](https://learn.microsoft.com/en-us/dotnet/desktop/winforms/advanced/how-to-use-a-color-matrix-to-transform-a-single-color?view=netframeworkdesktop-4.8)Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p5.1)\.
- L\. van der Maaten and G\. Hinton \(2008\)Visualizing data using t\-sne\.Journal of Machine Learning Research9\(86\),pp\. 2579–2605\.External Links:[Link](http://jmlr.org/papers/v9/vandermaaten08a.html)Cited by:[§3\.3](https://arxiv.org/html/2605.12792#S3.SS3.p1.1)\.
- B\. van Rooyen, A\. K\. Menon, and R\. C\. Williamson \(2015\)Learning with symmetric label noise: the importance of being unhinged\.InAnnual Conference on Neural Information Processing Systems \(NeurIPS\),pp\. 10–18\.External Links:[Link](https://proceedings.neurips.cc/paper/2015/hash/45c48cce2e2d7fbdea1afc51c7c6ad26-Abstract.html)Cited by:[§2\.5](https://arxiv.org/html/2605.12792#S2.SS5.p1.1)\.
- J\. Verbraeken, M\. Wolting, J\. Katzy, J\. Kloppenburg, T\. Verbelen, and J\. S\. Rellermeyer \(2020\)A survey on distributed machine learning\.ACM Comput\. Surv\.53\(2\),pp\. 30:1–30:33\.External Links:[Link](https://doi.org/10.1145/3377454),[Document](https://dx.doi.org/10.1145/3377454)Cited by:[§4\.3](https://arxiv.org/html/2605.12792#S4.SS3.p1.1)\.
- Y\. Vorobeychik and M\. Kantarcioglu \(2018\)Adversarial machine learning\.Synthesis Lectures on Artificial Intelligence and Machine Learning12\(3\),pp\. 1–169\.Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p5.1)\.
- C\. Wang, J\. Chen, Y\. Yang, X\. Ma, and J\. Liu \(2022\)Poisoning attacks and countermeasures in intelligent networks: status quo and prospects\.Digit\. Commun\. Networks8\(2\),pp\. 225–234\.External Links:[Link](https://doi.org/10.1016/j.dcan.2021.07.009),[Document](https://dx.doi.org/10.1016/j.dcan.2021.07.009)Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.4),[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p5.1)\.
- X\. Wang, M\. Li, P\. Xu, W\. Liu, L\. Y\. Zhang, S\. Hu, and Y\. Zhang \(2024a\)PointAPA: towards availability poisoning attacks in 3d point clouds\.InEuropean Symposium on Research in Computer Security,pp\. 125–145\.Cited by:[§4\.6](https://arxiv.org/html/2605.12792#S4.SS6.p1.1.1)\.
- Y\. Wang, Y\. Zhu, and X\. Gao \(2024b\)Efficient availability attacks against supervised and contrastive learning simultaneously\.InNeural Information Processing Systems, NeurIPS, Vancouver, BC, Canada,External Links:[Link](http://papers.nips.cc/paper%5C_files/paper/2024/hash/85826ad1eb4602a2962b7cdbe129b341-Abstract-Conference.html)Cited by:[§4\.4](https://arxiv.org/html/2605.12792#S4.SS4.p1.1.1)\.
- Z\. Wang, Y\. Wang, and Y\. Wang \(2021\)Fooling adversarial training with inducing noise\.CoRRabs/2111\.10130\.External Links:[Link](https://arxiv.org/abs/2111.10130),2111\.10130Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1),[§2\.4](https://arxiv.org/html/2605.12792#S2.SS4.p1.2),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p2.1),[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p1.1)\.
- G\. Wen, Z\. Hou, H\. Li, D\. Li, L\. Jiang, and E\. Xun \(2017\)Ensemble of deep neural networks with probability\-based fusion for facial expression recognition\.Cogn\. Comput\.9\(5\),pp\. 597–610\.External Links:[Link](https://doi.org/10.1007/s12559-017-9472-6),[Document](https://dx.doi.org/10.1007/s12559-017-9472-6)Cited by:[§4\.5](https://arxiv.org/html/2605.12792#S4.SS5.p1.1)\.
- R\. Wen, Z\. Zhao, Z\. Liu, M\. Backes, T\. Wang, and Y\. Zhang \(2023\)Is adversarial training really a silver bullet for mitigating data poisoning?\.InThe Eleventh International Conference on Learning Representations, ICLR, Kigali, Rwanda,Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1)\.
- S\. Wu, S\. Chen, C\. Xie, and X\. Huang \(2022\)One\-pixel shortcut: on the learning preference of deep neural networks\.CoRRabs/2205\.12141\.External Links:[Link](https://doi.org/10.48550/arXiv.2205.12141),[Document](https://dx.doi.org/10.48550/arXiv.2205.12141),2205\.12141Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p3.1.1),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p2.1),[Table 2](https://arxiv.org/html/2605.12792#S3.T2.1.8.7.1),[Table 3](https://arxiv.org/html/2605.12792#S3.T3.1.8.7.1),[Table 4](https://arxiv.org/html/2605.12792#S3.T4.1.9.9.1)\.
- Y\. Xiao, J\. Wu, Z\. Lin, and X\. Zhao \(2018\)A deep learning\-based multi\-model ensemble method for cancer prediction\.Comput\. Methods Programs Biomed\.153,pp\. 1–9\.External Links:[Link](https://doi.org/10.1016/j.cmpb.2017.09.005),[Document](https://dx.doi.org/10.1016/j.cmpb.2017.09.005)Cited by:[§4\.5](https://arxiv.org/html/2605.12792#S4.SS5.p1.1)\.
- D\. Yi, Z\. Lei, S\. Liao, and S\. Z\. Li \(2014\)Learning face representation from scratch\.CoRRabs/1411\.7923\.External Links:[Link](http://arxiv.org/abs/1411.7923),1411\.7923Cited by:[§4\.2](https://arxiv.org/html/2605.12792#S4.SS2.p1.1)\.
- D\. Yu, H\. Zhang, W\. Chen, J\. Yin, and T\. Liu \(2021\)Indiscriminate poisoning attacks are shortcuts\.CoRRabs/2111\.00898\.External Links:[Link](https://arxiv.org/abs/2111.00898),2111\.00898Cited by:[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.3.2.3.1.1),[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.6.5.3.1.1),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p2.1),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p3.1.1),[§3\.3](https://arxiv.org/html/2605.12792#S3.SS3.p1.1),[§3\.3](https://arxiv.org/html/2605.12792#S3.SS3.p2.1),[§3\.3](https://arxiv.org/html/2605.12792#S3.SS3.p3.1.1),[Table 2](https://arxiv.org/html/2605.12792#S3.T2.1.6.5.1),[Table 3](https://arxiv.org/html/2605.12792#S3.T3.1.6.5.1),[Table 4](https://arxiv.org/html/2605.12792#S3.T4.1.7.7.1),[§6\.2](https://arxiv.org/html/2605.12792#S6.SS2.p1.1)\.
- Y\. Yu, Y\. Wang, S\. Xia, W\. Yang, S\. Lu, Y\. Tan, and A\. C\. Kot \(2024\)Purify unlearnable examples via rate\-constrained variational autoencoders\.InForty\-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21\-27, 2024,External Links:[Link](https://openreview.net/forum?id=0LBNdbmQCM)Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p3.1)\.
- C\. Yuan and S\. Wu \(2021\)Neural tangent generalization attacks\.InProceedings of the 38th International Conference on Machine Learning \(ICML\),pp\. 12230–12240\.External Links:[Link](http://proceedings.mlr.press/v139/yuan21b.html)Cited by:[2nd item](https://arxiv.org/html/2605.12792#S1.I1.i2.p1.1),[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.3.2.3.1.1),[§1](https://arxiv.org/html/2605.12792#S1.tab1.1.7.6.3.1.1),[§1](https://arxiv.org/html/2605.12792#S1.tab1.3),[§1](https://arxiv.org/html/2605.12792#S1.tab1.6),[§2\.2](https://arxiv.org/html/2605.12792#S2.SS2.p2.1),[§2\.5](https://arxiv.org/html/2605.12792#S2.SS5.p1.1),[Definition 2\.1](https://arxiv.org/html/2605.12792#S2.Thmtheorem1.p1.1),[§3\.1](https://arxiv.org/html/2605.12792#S3.SS1.p1.1),[§3\.2\.1](https://arxiv.org/html/2605.12792#S3.SS2.SSS1.p4.1),[Table 2](https://arxiv.org/html/2605.12792#S3.T2.1.2.1.1),[Table 3](https://arxiv.org/html/2605.12792#S3.T3.1.2.1.1),[Table 4](https://arxiv.org/html/2605.12792#S3.T4.1.3.3.1),[§3](https://arxiv.org/html/2605.12792#S3.p1.1),[§6\.2](https://arxiv.org/html/2605.12792#S6.SS2.p1.1)\.
- S\. Yun, D\. Han, S\. Chun, S\. J\. Oh, Y\. Yoo, and J\. Choe \(2019\)CutMix: regularization strategy to train strong classifiers with localizable features\.InIEEE/CVF International Conference on Computer Vision \(ICCV\), Seoul, Korea \(South\),pp\. 6022–6031\.External Links:[Link](https://doi.org/10.1109/ICCV.2019.00612),[Document](https://dx.doi.org/10.1109/ICCV.2019.00612)Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1)\.
- Y\. Zeng, H\. Qiu, G\. Memmi, and M\. Qiu \(2020\)A data augmentation\-based defense method against adversarial attacks in neural networks\.InAlgorithms and Architectures for Parallel Processing \- 20th International Conference \(ICA3PP\), New York City, NY, USA,M\. Qiu \(Ed\.\),Lecture Notes in Computer Science, Vol\.12453,pp\. 274–289\.External Links:[Link](https://doi.org/10.1007/978-3-030-60239-0%5C_19),[Document](https://dx.doi.org/10.1007/978-3-030-60239-0%5F19)Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1)\.
- H\. Zhang, M\. Cissé, Y\. N\. Dauphin, and D\. Lopez\-Paz \(2018\)Mixup: beyond empirical risk minimization\.In6th International Conference on Learning Representations \(ICLR\), Vancouver, BC, Canada,External Links:[Link](https://openreview.net/forum?id=r1Ddp1-Rb)Cited by:[§3\.2\.2](https://arxiv.org/html/2605.12792#S3.SS2.SSS2.p1.1)\.
- B\. Zhao and Y\. Lao \(2022\)Towards class\-oriented poisoning attacks against neural networks\.InIEEE/CVF Winter Conference on Applications of Computer Vision \(WACV\),pp\. 2244–2253\.External Links:[Link](https://doi.org/10.1109/WACV51458.2022.00230),[Document](https://dx.doi.org/10.1109/WACV51458.2022.00230)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1)\.
- C\. Zhu, W\. R\. Huang, H\. Li, G\. Taylor, C\. Studer, and T\. Goldstein \(2019\)Transferable clean\-label poisoning attacks on deep neural nets\.InProceedings of the 36th International Conference on Machine Learning \(ICML\),Vol\.97,pp\. 7614–7623\.External Links:[Link](http://proceedings.mlr.press/v97/zhu19a.html)Cited by:[§2\.1](https://arxiv.org/html/2605.12792#S2.SS1.p2.1)\.
- Y\. Zhu, L\. Yu, and X\. Gao \(2024\)Detection and defense of unlearnable examples\.InThirty\-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty\-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2024, February 20\-27, 2024, Vancouver, Canada,pp\. 17211–17219\.External Links:[Link](https://doi.org/10.1609/aaai.v38i15.29667)Cited by:[§3\.3](https://arxiv.org/html/2605.12792#S3.SS3.p2.1.1)\.
## 6\.Appendix
### 6\.1\.Notations
Table 6\.Notation

### 6\.2\.Experimental Settings
The NTGA dataset was collected from the data released by the authors in their GitHub repository\(Yuan and Wu,[2021](https://arxiv.org/html/2605.12792#bib.bib1)\)\. Error\-minimizing, error\-maximizing, DeepConfuse, and synthetic data were collected from the authors of Yu et al\.\(Yuet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib5)\)\. Robust Error\-Minimizing, One\-Pixel Shortcut, and Autoregressive datasets were created using code from their GitHub repositories with default settings\. According to Yu et al\.\(Yuet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib5)\), DeepConfuse used an 8\-layer U\-Net as the surrogate model, while error\-minimizing and error\-maximizing used ResNet\-18 models\. NTGA used a CNN surrogate model\. Robust Error\-Minimizing employed a ResNet\-18 surrogate model\. Synthetic and One\-Pixel Shortcut do not involve surrogate models\. Perturbation budgets followed the default or the best conditions mentioned in the paper\(Yuet al\.,[2021](https://arxiv.org/html/2605.12792#bib.bib5)\)\.
In our experiments, we used the SGD optimizer with 0\.9 momentum\. The batch size was set to 100, and training was conducted for 30 epochs with a learning rate of 0\.001\. The loss function is cross\-entropy loss\. We did not use data augmentation techniques during training, as they can impact the protection given by data protection approaches\.
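For readers who wish to reproduce a comparable setup, the training hyperparameters stated above can be collected in a small configuration object. The following is a minimal sketch in plain Python, not the code used in our experiments: the names `TrainConfig`, `sgd_momentum_step`, and `total_updates` are hypothetical, the momentum update uses one common formulation (velocity = momentum × velocity + gradient), and the 50,000-sample dataset size in the usage lines is only an illustrative assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as stated in the experimental settings."""
    optimizer: str = "sgd"
    momentum: float = 0.9
    batch_size: int = 100
    epochs: int = 30
    learning_rate: float = 0.001
    loss: str = "cross_entropy"
    data_augmentation: bool = False  # augmentation can weaken the protection


def sgd_momentum_step(params, grads, velocity, cfg):
    """One SGD-with-momentum update (assumed form: v = mu*v + g, p = p - lr*v)."""
    new_velocity = [cfg.momentum * v + g for v, g in zip(velocity, grads)]
    new_params = [p - cfg.learning_rate * v
                  for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


def total_updates(num_samples, cfg):
    """Optimizer steps over the whole run, keeping a final partial batch."""
    steps_per_epoch = -(-num_samples // cfg.batch_size)  # ceiling division
    return steps_per_epoch * cfg.epochs


cfg = TrainConfig()

# Toy two-parameter example: one update from zero initial velocity.
params, velocity = [1.0, -2.0], [0.0, 0.0]
params, velocity = sgd_momentum_step(params, [0.5, -0.5], velocity, cfg)

# Hypothetical dataset of 50,000 training samples (an assumption).
n_updates = total_updates(50_000, cfg)
```

Keeping the settings in a frozen dataclass makes it harder to change a value by accident between runs on different protected datasets, which matters when comparing test accuracy across attacks.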