Approximate Machine Unlearning through Manifold Representation Forgetting Guided by Self Mode Connectivity
Summary
This paper proposes ManiF-SMC, a method for approximate machine unlearning that operates entirely in the representation space by pushing erased samples away from their original learned manifold representation toward their nearest semantic neighbors in the retained data, using a margin-based triplet loss guided by a self-mode-connectivity module for adaptive margins.
# Approximate Machine Unlearning through Manifold Representation Forgetting Guided by Self Mode Connectivity
Source: [https://arxiv.org/html/2605.22871](https://arxiv.org/html/2605.22871)
To appear at KDD 2026\. Author version
###### Abstract\.
Machine unlearning is a fundamental mechanism that enforces the right to be forgotten\. Existing unlearning studies that rely on label manipulation or task\-gradient reversal often deliver limited unlearning effectiveness\. Moreover, they can undermine the original learning objective and typically do not guarantee equivalence to standard unlearning by retraining\.
In this paper, we propose ManiF\-SMC \(Manifold Forgetting with Self Mode Connectivity\), motivated by the observation that a model retrained on the remaining data tends to classify erased samples by their semantic similarity to the retained data\. We begin by systematically recasting approximate unlearning as pushing each erased sample away from its original learned manifold representation centroid and toward its nearest semantic neighbors in the retained data\. This reformulation aligns unlearning with retraining behavior and operates purely in representation space, reducing reliance on labels and task\-specific gradients\. To tackle the manifold representation\-based unlearning problem, ManiF\-SMC encapsulates the unlearning and representation preservation goals in a margin\-based triplet loss\. Because finding a suitable margin for unlearning is challenging, we propose a self\-mode\-connectivity module that rapidly reconstructs the local manifold to guide the generation of adaptive margins for each unlearning case\. Extensive experiments on four representative datasets show that ManiF\-SMC achieves unlearning effectiveness comparable to state\-of\-the\-art approximate methods while operating solely within the model’s representation space\.
Machine Unlearning, Manifold Representation, Mode Connectivity
CCS Concepts: Security and privacy; Computing methodologies → Machine learning

## 1\.Introduction
Heightened awareness around data privacy has ushered in rigorous regulatory efforts, exemplified by laws such as the General Data Protection Regulation \(GDPR\) \(Mantelero, [2013](https://arxiv.org/html/2605.22871#bib.bib26)\)\. These legal frameworks guarantee individuals the right to be forgotten, sparking a nascent area that focuses on removing the influence of specified samples from trained machine learning \(ML\) models, i\.e\., machine unlearning \(Bourtoule et al\., [2021](https://arxiv.org/html/2605.22871#bib.bib3); Warnecke et al\., [2024](https://arxiv.org/html/2605.22871#bib.bib42)\)\. Although retraining from scratch offers the most faithful approach to unlearning, its computational overhead is often prohibitive, spurring the development of more practical approximate unlearning methods \(Nguyen et al\., [2020](https://arxiv.org/html/2605.22871#bib.bib28); Guo et al\., [2020](https://arxiv.org/html/2605.22871#bib.bib18)\)\.
Figure 1\. Representation space of the Gradient Ascent unlearned model \(left\) and the retrained model \(right\) on CIFAR10 after unlearning 1% \(500\) randomly selected samples\. Small points denote retained samples from different clusters\. Larger points denote forgotten samples, assigned to the retained cluster with the highest semantic similarity\.

Most existing approximate unlearning methods take a data\-centric perspective, requiring the manipulation of labels or the reversal of task\-relevant gradients, which improves efficiency but limits unlearning effectiveness\. In particular, we face two main challenges\. Firstly, unlearning based on label manipulation and gradient reversal conflicts with the original learning objective and typically does not guarantee equivalence to gold\-standard unlearning by retraining from scratch \(Thudi et al\., [2022b](https://arxiv.org/html/2605.22871#bib.bib36); Ebrahimpour\-Boroojeny et al\., [2025](https://arxiv.org/html/2605.22871#bib.bib11)\)\. Secondly, most approximate unlearning methods construct the unlearning loss in a way that relies heavily on label access, either to manipulate the labels or to reverse the task\-specific gradients \(Nguyen et al\., [2020](https://arxiv.org/html/2605.22871#bib.bib28); Neel et al\., [2021](https://arxiv.org/html/2605.22871#bib.bib27)\)\. However, in some complex scenarios, such as semantic communication \(Huang et al\., [2022](https://arxiv.org/html/2605.22871#bib.bib19); Tian et al\., [2024](https://arxiv.org/html/2605.22871#bib.bib38)\) and federated learning \(Li et al\., [2020](https://arxiv.org/html/2605.22871#bib.bib24)\), the unlearning server may only access the model or learned representations\. Based on these two challenges, we ask the following question\.
Research Question\. How can we achieve effective approximate unlearning in representation space, reducing reliance on labels, while remaining consistent with retraining\-from\-scratch unlearning?
Inspiration\.To develop an effective approximate unlearning method, we first examine how a retraining\-from\-scratch model represents the erased samples at test time\.[Figure1](https://arxiv.org/html/2605.22871#S1.F1)compares the representation space of Gradient Ascent unlearning\(Graves et al\.,[2021](https://arxiv.org/html/2605.22871#bib.bib17); Thudi et al\.,[2022a](https://arxiv.org/html/2605.22871#bib.bib35)\)and retraining from scratch when forgetting 1% randomly selected CIFAR10 samples\. We observe that, under retraining from scratch, erased samples tend to be mapped close to the retained samples that are most semantically similar to them\. This motivates a representation\-space reformulation of approximate unlearning: we encourage erased samples to move toward the region of their most similar retained samples, so that the resulting model behavior better matches retraining\-from\-scratch unlearning\.
Moreover, we noticed that learned representations have been extensively studied\(Yerxa et al\.,[2023](https://arxiv.org/html/2605.22871#bib.bib47); Wang and Isola,[2020](https://arxiv.org/html/2605.22871#bib.bib40)\)\. In self\-supervised learning, promoting compactness and uniformity is known to improve representation quality and downstream performance\(Tian et al\.,[2021](https://arxiv.org/html/2605.22871#bib.bib37)\)\. Building on this line, the maximum manifold capacity representations \(MMCR\) framework\(Yerxa et al\.,[2023](https://arxiv.org/html/2605.22871#bib.bib47)\)shows that regulating representation geometry increases class separability and supports high\-quality recognition\. These results suggest that controlling representation structure can steer model behavior without relying on label manipulation or task\-specific gradients\. Therefore, we investigate approximate unlearning from a representation\-space perspective and use MMCR as a principled regularization tool\.
Our Work\.In this paper, we begin by systematically revisiting and reformulating the approximate unlearning problem through the lens of manifold representations and the observation of retraining results in[Figure1](https://arxiv.org/html/2605.22871#S1.F1)\. Specifically, we interpret approximate unlearning as*moving an erased sample’s representation away from its original location and toward the centroid of its most semantically similar retained samples, from a manifold representation perspective*\. This reformulation aligns approximate unlearning with retraining behavior and operates purely in representation space, decoupling reliance on labels and task\-specific gradients\. We compare existing approximate unlearning formulations with ours in[Figure2](https://arxiv.org/html/2605.22871#S1.F2)\.
To solve this representation\-based unlearning problem, we propose Manifold Forgetting with Self Mode Connectivity \(ManiF\-SMC\)\. ManiF\-SMC uses a triplet contrastive unlearning loss: it pushes the erased sample away from its original learned representation while pulling it toward the centroid of similar retained representations, with the separation controlled by a margin\. However, both the target centroid \(of similar retained samples\) and an appropriate margin are hard to determine a priori\. We introduce a self mode connectivity module that quickly reconstructs the local manifold structure of these similar retained samples, and uses it to estimate the centroid and adapt the margin during unlearning optimization\.
We have two key findings through extensive experiments across model architectures and datasets\. First, ManiF\-SMC cuts the reliance on task labels by operating purely in representation space, while achieving strong unlearning effectiveness compared with state\-of\-the\-art approximate unlearning baselines\. This label\-agnostic unlearning formulation also enables deployment in broader settings such as semantic communication systems\(Huang et al\.,[2022](https://arxiv.org/html/2605.22871#bib.bib19); Tian et al\.,[2024](https://arxiv.org/html/2605.22871#bib.bib38)\), where the encoder may serve multiple downstream tasks and labels may be unavailable at the unlearning stage\. Second, we observed that adopting the MMCRs for model training can improve class separability and boost unlearning effectiveness for both existing approximate and exact unlearning methods\.
Figure 2\. Comparison with existing approximate unlearning studies\. \(a\) Existing approximate unlearning studies use the data and label information, unlearning by reversing the corresponding gradients\. \(b\) Manifold representation\-based unlearning pushes the unlearning sample away from its original representation and toward the centroid of the most semantically similar samples in the retained set\.

Our contributions are summarized as follows\.
- We revisit approximate unlearning from a representation perspective and reformulate it as moving an erased sample’s representation away from its original position and toward the centroid of its most semantically similar retained samples\.
- We propose Manifold Forgetting with Self Mode Connectivity \(ManiF\-SMC\), a label\-agnostic unlearning approach that operates purely in representation space and adaptively estimates the centroid and margin for optimization\.
- Extensive experiments across datasets and architectures show that ManiF\-SMC achieves strong approximate unlearning effectiveness without accessing task labels\. We further demonstrate that MMCR\-based representation regularization improves class separability and consistently boosts the unlearning performance of both approximate and retraining\-based methods\. Our code is available at [https://anonymous\.4open\.science/r/ManiF\-C120](https://anonymous.4open.science/r/ManiF-C120)\.
## 2\.Related Work
Machine unlearning techniques are motivated by the growing privacy concerns of individuals and the corresponding privacy regulations\(Cao and Yang,[2015](https://arxiv.org/html/2605.22871#bib.bib4); Bourtoule et al\.,[2021](https://arxiv.org/html/2605.22871#bib.bib3)\)\. The most legitimate approach is retraining from scratch\(Cao and Yang,[2015](https://arxiv.org/html/2605.22871#bib.bib4); Thudi et al\.,[2022b](https://arxiv.org/html/2605.22871#bib.bib36)\)\. However, this method is often impractical due to the significant computational and storage costs involved, especially for complex deep\-learning tasks\. Consequently, numerous studies have sought to develop efficient unlearning solutions\(Yan et al\.,[2022](https://arxiv.org/html/2605.22871#bib.bib45); Warnecke et al\.,[2024](https://arxiv.org/html/2605.22871#bib.bib42)\)\.
Exact Unlearning\. Exact unlearning methods aim to reduce the computational cost of retraining a new model by redesigning the learning algorithms and storing intermediate parameters during the learning process \(Cao and Yang, [2015](https://arxiv.org/html/2605.22871#bib.bib4); Bourtoule et al\., [2021](https://arxiv.org/html/2605.22871#bib.bib3); Yan et al\., [2022](https://arxiv.org/html/2605.22871#bib.bib45); Wu et al\., [2020](https://arxiv.org/html/2605.22871#bib.bib44)\)\. One popular exact unlearning method is introduced by Bourtoule et al\. \([2021](https://arxiv.org/html/2605.22871#bib.bib3)\), where the ML server divides the full data into shards and trains a sub\-model separately on each shard\. When unlearning, the server only needs to remove the erased data from the corresponding shard and retrain the sub\-model of this shard \(Bourtoule et al\., [2021](https://arxiv.org/html/2605.22871#bib.bib3)\)\. Exact unlearning has the advantage of completely removing the influence of the unlearned data on the model\. However, these methods require substantial storage space and are inefficient when removal requests are frequent\.
Approximate Unlearning\. Approximate unlearning methods aim to implement unlearning using only the original model and the samples to be erased, approximating a model as if it were retrained on the remaining dataset \(Nguyen et al\., [2020](https://arxiv.org/html/2605.22871#bib.bib28); Shen et al\., [2024](https://arxiv.org/html/2605.22871#bib.bib33); Warnecke et al\., [2024](https://arxiv.org/html/2605.22871#bib.bib42)\)\. One representative approximate unlearning solution is based on the Hessian matrix and Newton updates \(Warnecke et al\., [2024](https://arxiv.org/html/2605.22871#bib.bib42); Sekhari et al\., [2021](https://arxiv.org/html/2605.22871#bib.bib31)\), which ensures that a model from which data is erased cannot be distinguished from a model that never observed the data\. Other representative solutions are based on Bayesian inference \(Nguyen et al\., [2020](https://arxiv.org/html/2605.22871#bib.bib28); Fu et al\., [2022](https://arxiv.org/html/2605.22871#bib.bib13); Nguyen et al\., [2022](https://arxiv.org/html/2605.22871#bib.bib29)\), which unlearn an approximate posterior based only on the erased samples\.
Moreover, a line of approximate unlearning methods performs post\-hoc model editing by manipulating learned representations or decision boundaries\. Fan et al\. \([2024](https://arxiv.org/html/2605.22871#bib.bib12)\) shift attention from updating the full network to modifying a targeted subset of weights, improving both effectiveness and efficiency\. Wang et al\. \([2024](https://arxiv.org/html/2605.22871#bib.bib41)\) propose representation forgetting by operating on a learned bottleneck representation, while boundary unlearning \(Chen et al\., [2023](https://arxiv.org/html/2605.22871#bib.bib5)\) induces class\-level forgetting by shifting decision boundaries\. [Shah et al\.](https://arxiv.org/html/2605.22871#bib.bib32) \([2025](https://arxiv.org/html/2605.22871#bib.bib32)\) target the class unlearning problem using sparse representations learned by a Discrete Key\-Value Bottleneck \(Träuble et al\., [2023](https://arxiv.org/html/2605.22871#bib.bib39)\)\.
Despite differing in their intervention points \(weights, representations, or boundaries\), these approaches typically depend on the original task formulation and label supervision to define the forgetting objective and drive the update\. In contrast, we study representation unlearning without using label information\.
## 3\.Preliminary and Problem Reformulation
### 3\.1\.Background of Manifold Capacity Representation Theory
Consider $P$ class\-labeled manifolds embedded in a $D$\-dimensional feature space\. Manifold capacity theory asks: what is the largest load ratio $\frac{P}{D}$ for which, with high probability, a single hyperplane can separate an arbitrary \(random\) dichotomy of those manifolds \(Gardner, [1988](https://arxiv.org/html/2605.22871#bib.bib14)\)? Recent work shows that there exists a manifold capacity value $C$ such that the probability of finding a separating hyperplane is approximately 1 when $\frac{P}{D} \leq C$ and essentially 0 when $\frac{P}{D} > C$ \(Chung et al\., [2018](https://arxiv.org/html/2605.22871#bib.bib9)\)\. This critical capacity $C$ can be predicted from three geometric quantities: \(1\) the manifold radius $R_M$, \(2\) the manifold dimensionality $D_M$, and \(3\) the correlation of the manifold centroids\.
When the manifold centroid correlation is low, the manifold capacity value can be approximated by $\phi(R_M\sqrt{D_M})$, where $\phi$ is a monotonically decreasing function\. [Yerxa et al\.](https://arxiv.org/html/2605.22871#bib.bib47) \([2023](https://arxiv.org/html/2605.22871#bib.bib47)\) further rewrite the capacity as $C \approx \phi(\sum_i \sigma_i)$, where $\sigma_i$ are the singular values of a matrix of sampled manifold points \(equivalently, the square roots of the covariance\-matrix eigenvalues\)\. Maximizing the manifold capacity representation supports high\-quality object recognition\. We compare training a self\-supervised model with and without MMCRs \(Yerxa et al\., [2023](https://arxiv.org/html/2605.22871#bib.bib47)\) on MNIST in [Figure 6](https://arxiv.org/html/2605.22871#A1.F6) in [Appendix A](https://arxiv.org/html/2605.22871#A1)\. With MMCRs during model training, the points of each class stay close together \(small intra\-class distance\) while different classes lie far apart \(large inter\-class distance\)\. These properties inspire us to investigate an efficient approximate unlearning method that operates only on the manifold representation\.
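As a small numerical illustration of the quantity $\sum_i \sigma_i$ inside $\phi$ (the nuclear norm of the point matrix), the sketch below computes it for a toy matrix and checks the stated link to eigenvalues. The matrix `X` is made-up data, and the check uses the uncentered, unnormalized matrix `X.T @ X` instead of a proper covariance, which is a simplifying assumption of this example and not something the paper specifies.

```python
import numpy as np

# Toy matrix of sampled points (rows = points, columns = feature dimensions).
X = np.array([[3.0, 0.0],
              [0.0, 4.0],
              [0.0, 0.0]])

# Singular values of X; their sum is the nuclear norm that appears inside phi.
sigma = np.linalg.svd(X, compute_uv=False)
nuclear_norm = sigma.sum()

# The same values recovered as square roots of the eigenvalues of X^T X
# (uncentered and unnormalized here; an assumption made for this illustration).
eigvals = np.linalg.eigvalsh(X.T @ X)
sigma_from_eig = np.sqrt(np.clip(eigvals, 0.0, None))
```

For this toy matrix the singular values are 4 and 3, so the nuclear norm is 7.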
### 3\.2\.Problem Reformulation
Notation and Setup\. We summarize most notations in [Table 5](https://arxiv.org/html/2605.22871#A1.T5) in [Appendix B](https://arxiv.org/html/2605.22871#A2)\. Let $\theta_o$ be the original parameters of the model $f_{\theta_o}(\cdot)$ and $\theta_u$ be the new parameters after unlearning\. We denote the original representation of $x_i$ by $\texttt{z}_{i,o} = f_{\theta_o}(x_i)$ and the representation after unlearning as $\texttt{z}_{i,u} = f_{\theta_u}(x_i)$\. The original training set $S$ is partitioned into the unlearning set $S_u$ and the remaining set $S_r$\.
Neighborhoods and Local Centroid\. For each erased sample $x_i \in S_u$, let $S_k^i \subseteq S_r$ be its top\-$k$ most similar retained samples, selected using the original representations \(e\.g\., nearest neighbors of $\texttt{z}_{i,o}$ among $\{\texttt{z}_{j,o} : x_j \in S_r\}$\)\. We define the local centroid of the top\-$k$ similar samples of $x_i$ on the unlearned model as

$$\texttt{c}_{i,u} \;=\; \frac{1}{|S_k^i|} \sum_{x_k^i \in S_k^i} f_{\theta_u}(x_k^i). \tag{1}$$

The original centroid $\texttt{c}_{i,o}$ for $S_k^i$ can also be calculated by [Eq\. 1](https://arxiv.org/html/2605.22871#S3.E1) using the original model $f_{\theta_o}(\cdot)$\.
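The neighbor selection and the local centroid of Eq. 1 can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation: the function names `top_k_neighbors` and `local_centroids` are hypothetical, Euclidean distance is assumed as the similarity measure, and the arrays stand in for representations already produced by the encoder (the original model for neighbor selection, and whichever model is used for the centroid).

```python
import numpy as np

def top_k_neighbors(z_erased, z_retained, k):
    """Indices of the k retained representations closest (Euclidean) to each erased one."""
    # Pairwise squared distances, shape (n_erased, n_retained).
    d2 = ((z_erased[:, None, :] - z_retained[None, :, :]) ** 2).sum(axis=-1)
    return np.argsort(d2, axis=1)[:, :k]

def local_centroids(z_retained, neighbor_idx):
    """Eq. 1: mean representation over each erased sample's neighbor set."""
    return z_retained[neighbor_idx].mean(axis=1)

# Toy data: two retained clusters and one erased sample near the second cluster.
z_retained = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 12.0]])
z_erased = np.array([[9.0, 10.0]])

idx = top_k_neighbors(z_erased, z_retained, k=2)
centroids = local_centroids(z_retained, idx)
```

In this toy case the two nearest retained points are the ones at indices 2 and 3, so the centroid is their mean, `[10, 11]`.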
Motivated by the observation in[Figure1](https://arxiv.org/html/2605.22871#S1.F1), we use a manifold view: after retraining onSrS\_\{r\}, an erased sample tends to behave similarly to its semantic neighbors inSrS\_\{r\}\. This suggests that effective approximate unlearning should \(i\) move the erased sample away from its original representation region, while \(ii\) aligning it with similar retained samples\.
Push Away from the Original Representation\. Prior work shows that approximate unlearning can be induced by altering erased samples’ labels or representations \(Fan et al\., [2024](https://arxiv.org/html/2605.22871#bib.bib12); Chundawat et al\., [2023](https://arxiv.org/html/2605.22871#bib.bib8); Guo et al\., [2020](https://arxiv.org/html/2605.22871#bib.bib18)\)\. In representation space, this can be expressed by encouraging $\texttt{z}_{i,u}$ to move away from its original representation:

$$\textbf{Push term:}\quad \max_{\theta_u} \; \sum_{x_i \in S_u} \big\| f_{\theta_u}(x_i) - \texttt{z}_{i,o} \big\|^2. \tag{2}$$

However, as mentioned above, reversing the label or blindly pushing the representation can conflict with the learning objective and may deviate from retraining behavior\.
Pull Toward Similar Retained Samples\. To align with the retraining observation, we encourage each erased sample to move toward its semantic neighbors in $S_r$\. Using the same neighbor set $S_k^i$, we pull $\texttt{z}_{i,u}$ toward the centroid of those retained neighbors:

$$\textbf{Pull term:}\quad \min_{\theta_u} \; \sum_{x_i \in S_u} \big\| f_{\theta_u}(x_i) - \texttt{c}_{i,u} \big\|^2. \tag{3}$$
Representation\-based Approximate Unlearning\. Together, the push and pull terms \([Eqs\. 2](https://arxiv.org/html/2605.22871#S3.E2) and [3](https://arxiv.org/html/2605.22871#S3.E3)\) define a representation\-space approximation to standard retraining: the erased samples are displaced from their original local region while being absorbed into the manifold supported by similar retained data\. This reformulation operates purely in representation space and therefore reduces reliance on label perturbations or task\-specific gradient manipulation\.
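To make the two objectives concrete, the sketch below evaluates the push term (Eq. 2) and the pull term (Eq. 3) for a single erased sample as its representation slides from the original location toward the neighbor centroid. The helper names `push_term` and `pull_term` are hypothetical. The centroid is held fixed in this toy example, whereas in the formulation above it also depends on $\theta_u$.

```python
import numpy as np

def push_term(z_u, z_o):
    """Eq. 2 objective: summed squared distance to the original representations."""
    return float(((z_u - z_o) ** 2).sum())

def pull_term(z_u, c_u):
    """Eq. 3 objective: summed squared distance to the retained-neighbor centroids."""
    return float(((z_u - c_u) ** 2).sum())

z_o = np.array([[0.0, 0.0]])   # original representation of one erased sample
c_u = np.array([[4.0, 0.0]])   # centroid of its retained neighbors (held fixed here)

# Move the representation a fraction s of the way from z_o to c_u.
for s in (0.25, 0.75):
    z_u = z_o + s * (c_u - z_o)
    print(s, push_term(z_u, z_o), pull_term(z_u, c_u))
```

Moving from `s = 0.25` to `s = 0.75` raises the push term from 1 to 9 and lowers the pull term from 9 to 1, so along this segment the two goals agree.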
Figure 3\. ManiF\-SMC for manifold forgetting\. ManiF forgets a target sample by pushing it away from its original representation and toward the centroid of its top\-$k$ nearest retained samples with margin $\alpha$\. To obtain a suitable $\alpha$ for ManiF optimization, we propose self mode connectivity, which links the unlearned model and the original model using only the top\-$k$ similar remaining samples, reconstructing a new local representation without the erased data\. We use the local representation reconstructed by the new connected model to guide the adaptive margin calculation\.
## 4\.The Manifold Forgetting Guided by Self Mode Connectivity
To address the representation\-based unlearning objective from a manifold perspective, we propose ManiF\-SMC \([Figure 3](https://arxiv.org/html/2605.22871#S3.F3)\), which consists of two components:
- Manifold contrastive forgetting \(ManiF\): a representation\-space unlearning method\.
- Self mode connectivity \(SMC\): a module that adaptively calculates the contrastive margin to guide the ManiF optimization\.
### 4\.1\.Manifold Contrastive Forgetting \(ManiF\)
Recall from the reformulation above that the representation\-based unlearning goal is to maximize $\sum_{x_i \in S_u} \|f_{\theta_u}(x_i) - \texttt{z}_{i,o}\|^2$ and to minimize $\sum_{x_i \in S_u} \|f_{\theta_u}(x_i) - \texttt{c}_{i,u}\|^2$\. This relational push\-pull objective fits naturally into a triplet or margin\-ranking formulation \(Schroff et al\., [2015](https://arxiv.org/html/2605.22871#bib.bib30); Weinberger and Saul, [2009](https://arxiv.org/html/2605.22871#bib.bib43)\)\. We can formulate a margin\-based triplet loss for representation\-based approximate unlearning as

$$\mathcal{L}_{\text{triplet}}(\theta_u) \;=\; \sum_{x_i \in S_u} \Bigl[\, \mathrm{dist}\bigl(f_{\theta_u}(x_i),\, \texttt{c}_{i,u}\bigr) \;-\; \mathrm{dist}\bigl(f_{\theta_u}(x_i),\, \texttt{z}_{i,o}\bigr) \;+\; \alpha \Bigr]_{+}, \tag{4}$$

where $[\,\cdot\,]_{+} = \max\{\,\cdot\,,\, 0\}$ and $\alpha > 0$ is a margin\. Here, $\mathrm{dist}(\cdot,\cdot)$ is commonly the Euclidean distance; other measures such as cosine similarity, Kullback\-Leibler divergence, and mutual information could also be used\. Minimizing $\mathcal{L}_{\text{triplet}}$ to zero enforces:

$$\mathrm{dist}(f_{\theta_u}(x_i),\, \texttt{c}_{i,u}) \;+\; \alpha \;\leq\; \mathrm{dist}(f_{\theta_u}(x_i),\, \texttt{z}_{i,o}). \tag{5}$$

Hence, the new unlearned representation of $x_i$ is at least $\alpha$ closer to the centroid of its top\-$k$ similar remaining samples than to its original representation\. Building on the compact manifold representation that the model has previously learned, we have the following unlearning distance guarantee\.
Unlearning Lower Bound of ManiF\. The triplet loss in [Eq\. 4](https://arxiv.org/html/2605.22871#S4.E4) is a sum of hinge terms and is always non\-negative\. Hence, $\mathcal{L}_{\text{triplet}}(\theta_u) = 0$ if and only if each hinge term is zero, i\.e\., for all $x_i \in S_u$,

$$\mathrm{dist}(f_{\theta_u}(x_i),\, \texttt{c}_{i,u}) \;+\; \alpha \;\leq\; \mathrm{dist}(f_{\theta_u}(x_i),\, \texttt{z}_{i,o}). \tag{6}$$

Thus, $\alpha$ provides a margin\-based lower bound: the unlearned representation of $x_i$ is at least $\alpha$ closer to the retained\-neighbor centroid than to its original representation \(under $\mathrm{dist}$\)\.
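The hinge structure of Eq. 4 can be sketched as below, assuming Euclidean distance for $\mathrm{dist}$ and a fixed margin. The function name `manif_triplet_loss` is hypothetical, and the arrays are toy stand-ins for encoder outputs, so this shows only the shape of the loss and not the authors' training code.

```python
import numpy as np

def manif_triplet_loss(z_u, c_u, z_o, alpha):
    """Eq. 4 with Euclidean dist: sum over erased samples of [d_pull - d_push + alpha]_+."""
    d_pull = np.linalg.norm(z_u - c_u, axis=1)   # distance to retained-neighbor centroid
    d_push = np.linalg.norm(z_u - z_o, axis=1)   # distance to original representation
    return float(np.maximum(d_pull - d_push + alpha, 0.0).sum())

# Two erased samples sharing the same original point and centroid (toy values).
z_o = np.array([[0.0, 0.0], [0.0, 0.0]])
c_u = np.array([[4.0, 0.0], [4.0, 0.0]])
z_u = np.array([[1.0, 0.0],    # still near the original: hinge = 3 - 1 + 1 = 3
                [3.0, 0.0]])   # near the centroid: 1 - 3 + 1 < 0, so the hinge is 0

loss = manif_triplet_loss(z_u, c_u, z_o, alpha=1.0)
```

The second sample already satisfies the inequality in Eq. 6 and contributes nothing, so the total loss is 3 and comes entirely from the first sample.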
Ideally, if we had a model retrained on $S_r$, we could use it to estimate the post\-retraining neighbor centroid $\texttt{c}_{i,u}$ and choose a suitable margin $\alpha$\. However, no retrained model is available\. Can we find a fast replacement? We propose to use local mode connectivity \(Garipov et al\., [2018](https://arxiv.org/html/2605.22871#bib.bib15); Zhao et al\., [2020](https://arxiv.org/html/2605.22871#bib.bib48)\) to quickly reconstruct a local manifold representation for the top\-$k$ most similar samples $S_k^i$ in the remaining dataset\.
### 4\.2\.Adaptive Margin for ManiF Guided by Self Mode Connectivity
Intuition\.In our problem, we need a surrogate model to reconstruct a local representation that is largely unaffected by the unlearned samples, so that we can use it to guide the post\-unlearning neighbor centroid and margin calculation\. Mode connectivity shows that well\-trained solutions can be linked by a low\-loss path, enabling efficient sampling of high\-performing models without full retraining\(Garipov et al\.,[2018](https://arxiv.org/html/2605.22871#bib.bib15)\)\. Prior work demonstrates that when two models are poisoned, learning a path between them using only a limited amount of benign data can produce intermediate models that mitigate the adversarial behavior while preserving clean accuracy\(Zhao et al\.,[2020](https://arxiv.org/html/2605.22871#bib.bib48)\)\.
Motivated by this, we learn a local connecting path using only the retained neighborhood data \(e\.g\., $S_k = \cup_{x_i \in S_u} S_k^i$\) and use a sampled model on this path to compute the centroid and an adaptive margin\.
Self Mode Connectivity via a Quadratic Bézier Path\. Given endpoints $\theta_u$ and $\theta_o$, we parameterize a quadratic Bézier curve

$$\phi_w(t) = (1-t)^2\theta_u + 2t(1-t)\,w + t^2\theta_o, \quad 0 \leq t \leq 1, \tag{7}$$

where $w$ is the learnable control point\. We learn $w$ by minimizing the retained loss along the path \(computed on $S_k$\), e\.g\.,

$$\min_{w}\ \mathbb{E}_{t \sim \mathcal{U}[0,1]} \big[\mathcal{L}_{S_k}(\phi_w(t))\big], \tag{8}$$

where $t \sim \mathcal{U}[0,1]$ means that $t$ is sampled from the uniform distribution on the interval $[0,1]$\. We then select a surrogate model on the path, $\tilde{\theta} = \phi_w(t^\star)$ \(we use $t^\star = 0.5$ in practice\), to estimate retained\-neighborhood representations\.
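A minimal sketch of Eqs. 7 and 8 on flattened parameter vectors is given below. It makes several assumptions that are not from the paper: the parameters are 2-dimensional toy vectors, the retained loss on $S_k$ is replaced by a hypothetical quadratic bowl `toy_loss`, the control point starts at the midpoint of the endpoints, and plain gradient descent with a hand-derived gradient is used instead of a deep learning framework.

```python
import numpy as np

def bezier_point(theta_u, w, theta_o, t):
    """Eq. 7: point on the quadratic Bezier curve from theta_u (t=0) to theta_o (t=1)."""
    return (1 - t) ** 2 * theta_u + 2 * t * (1 - t) * w + t ** 2 * theta_o

def toy_loss(theta, target):
    """Hypothetical stand-in for the retained loss on S_k: a quadratic bowl."""
    return float(((theta - target) ** 2).sum())

def fit_control_point(theta_u, theta_o, target, steps=500, lr=0.1, batch=32, seed=0):
    """Eq. 8: minimize the expected loss along the path over the control point w."""
    rng = np.random.default_rng(seed)
    w = 0.5 * (theta_u + theta_o)              # start from the straight segment
    for _ in range(steps):
        ts = rng.uniform(0.0, 1.0, size=batch)  # t ~ U[0, 1]
        grad = np.zeros_like(w)
        for t in ts:
            theta_t = bezier_point(theta_u, w, theta_o, t)
            # Chain rule: dL/dw = dL/dtheta * dtheta/dw, with dtheta/dw = 2 t (1 - t).
            grad += 2.0 * (theta_t - target) * 2.0 * t * (1.0 - t)
        w = w - lr * grad / batch
    return w

theta_u = np.array([0.0, 0.0])
theta_o = np.array([2.0, 0.0])
target = np.array([1.0, 1.0])                  # low-loss region off the straight segment

w = fit_control_point(theta_u, theta_o, target)
theta_tilde = bezier_point(theta_u, w, theta_o, 0.5)   # surrogate at t* = 0.5
```

The endpoints stay fixed at $\theta_u$ and $\theta_o$ regardless of `w`; only the interior of the path bends toward the low-loss region, which is why the surrogate is sampled at an interior value of `t`.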
Centroid and Adaptive Margin\. Replacing $\theta_u$ with $\tilde{\theta}$ in [Eq\. 1](https://arxiv.org/html/2605.22871#S3.E1), we compute the neighbor centroid $\texttt{c}_{i,\tilde{\theta}}$:

$$\texttt{c}_{i,\tilde{\theta}} = \frac{1}{|S_k^i|} \sum_{x_j \in S_k^i} f_{\tilde{\theta}}(x_j). \tag{9}$$

Moreover, based on $\tilde{\theta}$, we can set a sample\-wise adaptive margin to replace the original fixed $\alpha$ as

$$\alpha_{\tilde{\theta}}(x_i, \texttt{c}_{i,\tilde{\theta}}, \texttt{z}_{i,o}) = \Big[\mathrm{dist}\big(f_{\tilde{\theta}}(x_i),\, \texttt{z}_{i,o}\big) - \mathrm{dist}\big(f_{\tilde{\theta}}(x_i),\, \texttt{c}_{i,\tilde{\theta}}\big)\Big]_{+}, \tag{10}$$

and update ManiF as

$$\mathcal{L}_{\text{triplet}}(\theta_u) = \sum_{x_i \in S_u} \Big[\mathrm{dist}\big(f_{\theta_u}(x_i),\, \texttt{c}_{i,\tilde{\theta}}\big) - \mathrm{dist}\big(f_{\theta_u}(x_i),\, \texttt{z}_{i,o}\big) + \alpha_{\tilde{\theta}}(x_i, \texttt{c}_{i,\tilde{\theta}}, \texttt{z}_{i,o})\Big]_{+}. \tag{11}$$

We call the resulting method ManiF\-SMC and present the full ManiF\-SMC pseudocode in [Appendix C](https://arxiv.org/html/2605.22871#A3)\.
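The adaptive margin of Eq. 10 and the updated loss of Eq. 11 can be sketched as follows, assuming Euclidean distance. The function names are hypothetical, and `z_tilde` stands for the surrogate model's representation of the erased sample, $f_{\tilde{\theta}}(x_i)$, given here as a toy array instead of an actual model output.

```python
import numpy as np

def dist(a, b):
    """Row-wise Euclidean distance (one possible choice of dist)."""
    return np.linalg.norm(a - b, axis=-1)

def adaptive_margin(z_tilde, c_tilde, z_o):
    """Eq. 10: per-sample margin measured under the surrogate model."""
    return np.maximum(dist(z_tilde, z_o) - dist(z_tilde, c_tilde), 0.0)

def manif_smc_loss(z_u, c_tilde, z_o, alpha):
    """Eq. 11: triplet hinge with surrogate centroid and per-sample margin."""
    return float(np.maximum(dist(z_u, c_tilde) - dist(z_u, z_o) + alpha, 0.0).sum())

z_o = np.array([[0.0, 0.0]])       # original representation of the erased sample
c_tilde = np.array([[4.0, 0.0]])   # neighbor centroid under the surrogate (Eq. 9)
z_tilde = np.array([[3.0, 0.0]])   # surrogate's representation of the erased sample

alpha = adaptive_margin(z_tilde, c_tilde, z_o)           # 3 - 1 = 2
loss_start = manif_smc_loss(z_o, c_tilde, z_o, alpha)    # z_u = z_o: 4 - 0 + 2 = 6
loss_target = manif_smc_loss(z_tilde, c_tilde, z_o, alpha)
```

One consequence visible in this toy case: whenever the margin in Eq. 10 is positive, the hinge in Eq. 11 is exactly zero once the unlearned representation coincides with the surrogate's representation, so the surrogate effectively marks how far the sample needs to move.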
Enhancement of Self\-Mode\-Connectivity \(SMC\)\. ManiF\-SMC uses SMC to obtain a fast surrogate model $\tilde{\theta}$ \(sampled on a low\-loss Bézier path\) that approximates the retained\-data geometry without retraining\. This surrogate is used for both \(i\) estimating the retained\-neighbor centroid $\texttt{c}_{i,\tilde{\theta}}$ and \(ii\) setting a sample\-wise adaptive margin $\alpha_{\tilde{\theta}}(x_i, \texttt{c}_{i,\tilde{\theta}}, \texttt{z}_{i,o})$\.
###### Proposition \(Logit drift along the SMC path\)\.

Assume $g(\theta, x)$ is $L_x$\-Lipschitz in $\theta$\. Let $\tilde{\theta} = \phi_w(t^\star)$ on the path in [Eq\. 7](https://arxiv.org/html/2605.22871#S4.E7)\. Then

$$\bigl|g(\tilde{\theta}, x) - g(\theta_o, x)\bigr| \leq L_x \Big((1-t^\star)^2 \|\theta_u - \theta_o\|_2 + 2t^\star(1-t^\star)\|w - \theta_o\|_2\Big). \tag{12}$$

This smoothness implies that retained\-sample predictions \(and hence retained\-neighbor representations\) vary continuously along the path\. Therefore, using $\tilde{\theta} = \phi_w(t^\star)$ yields a stable estimate of the retained\-neighborhood centroid $\texttt{c}_{i,\tilde{\theta}}$, and the adaptive margin computed under $\tilde{\theta}$ reflects the local retained geometry more faithfully than a fixed global margin\.
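The bound in Eq. 12 follows in two short steps. The derivation below is a sketch that assumes the Lipschitz condition is taken with respect to the $\ell_2$ norm on parameters, which the statement of the proposition does not spell out.

```latex
% Step 1: the Bernstein weights (1-t)^2, 2t(1-t), t^2 sum to 1, so t^2 - 1 = -(1-t)^2 - 2t(1-t).
% With t = t^\star:
\begin{align*}
\tilde{\theta} - \theta_o
  &= (1-t)^2\,\theta_u + 2t(1-t)\,w + (t^2 - 1)\,\theta_o \\
  &= (1-t)^2\,(\theta_u - \theta_o) + 2t(1-t)\,(w - \theta_o).
\end{align*}
% Step 2: Lipschitz continuity in theta, then the triangle inequality.
\begin{align*}
\bigl|g(\tilde{\theta},x) - g(\theta_o,x)\bigr|
  &\le L_x\,\|\tilde{\theta} - \theta_o\|_2 \\
  &\le L_x\Big((1-t)^2\,\|\theta_u - \theta_o\|_2 + 2t(1-t)\,\|w - \theta_o\|_2\Big).
\end{align*}
```

At $t^\star = 0.5$ the two weights are $0.25$ and $0.5$, so the drift is controlled by a quarter of the endpoint gap plus half of the distance from the control point to $\theta_o$.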
## 5\.Experiments
In this section, we conduct experiments to answer the following research questions \(RQs\) and evaluate ManiF\-SMC:
- *RQ1*: How does ManiF\-SMC perform in terms of unlearning effectiveness and efficiency compared with state\-of\-the\-art approximate unlearning methods? \(See [Section 5\.2](https://arxiv.org/html/2605.22871#S5.SS2)\)
- *RQ2*: What are the contributions of key components \(e\.g\., MMCR and SMC\-based adaptive margin\), and how do different parameters and metrics influence ManiF\-SMC? \(See [Sections 5\.3](https://arxiv.org/html/2605.22871#S5.SS3) and [5\.4](https://arxiv.org/html/2605.22871#S5.SS4)\)
- *RQ3*: How well does ManiF\-SMC generalize to more challenging settings, such as unlearning for generative models and label\-limited scenarios where task labels are unavailable? \(See [Sections 5\.5](https://arxiv.org/html/2605.22871#S5.SS5) and [5\.6](https://arxiv.org/html/2605.22871#S5.SS6)\)
### 5\.1\.Experimental Setting
Datasets. We conduct experiments on four widely adopted public datasets: MNIST (Deng, [2012](https://arxiv.org/html/2605.22871#bib.bib10)), CIFAR10 (Krizhevsky et al., [2009](https://arxiv.org/html/2605.22871#bib.bib22)), CelebA (Liu et al., [2018](https://arxiv.org/html/2605.22871#bib.bib25)), and Tiny-ImageNet (Le and Yang, [2015](https://arxiv.org/html/2605.22871#bib.bib23)), which offer a range of object categories with varying levels of learning complexity. We present detailed statistics of all datasets and how we use them in [Appendix D](https://arxiv.org/html/2605.22871#A4).

Models. We select three model architectures of different sizes in our experiments: a 5-layer multi-layer perceptron (MLP) with ReLU activations, a 7-layer convolutional neural network (CNN), and ResNet-18. We use ResNet-18 as an encoder to learn the manifold representation and connect it to an MLP for task models on MNIST and to a CNN for task models on CIFAR10, CelebA, and Tiny-ImageNet. We set the dimensionality of the learned manifold representation to $D_M = 5$ on MNIST, $D_M = 64$ on CIFAR10 and CelebA, and $D_M = 256$ on Tiny-ImageNet. During training, we set the minibatch size to 16 on MNIST, CIFAR10, and CelebA, and to 200 on Tiny-ImageNet. All experiments are conducted on NVIDIA Quadro RTX 6000 GPUs.

Metric. Existing studies have assessed machine unlearning performance from different aspects (Graves et al., [2021](https://arxiv.org/html/2605.22871#bib.bib17); Golatkar et al., [2020](https://arxiv.org/html/2605.22871#bib.bib16)). After reviewing prior work, we focus on the following empirical metrics: membership inference attack (MIA), remaining accuracy (RA), testing accuracy (TA), remaining mean-squared error (R-MSE), testing mean-squared error (T-MSE), and running time (RT). We describe these metrics in detail in [Appendix E](https://arxiv.org/html/2605.22871#A5).

Compared Unlearning Benchmarks. We compare our method with Retraining and four mainstream unlearning algorithms: GA (Graves et al., [2021](https://arxiv.org/html/2605.22871#bib.bib17); Thudi et al., [2022a](https://arxiv.org/html/2605.22871#bib.bib35)), VBU (Nguyen et al., [2020](https://arxiv.org/html/2605.22871#bib.bib28)), RFU (Wang et al., [2024](https://arxiv.org/html/2605.22871#bib.bib41)), and SalUn (Fan et al., [2024](https://arxiv.org/html/2605.22871#bib.bib12)). An introduction to each method is provided in [Appendix F](https://arxiv.org/html/2605.22871#A6).
Figure 4. Representation space of GA, VBU, SalUn, and ManiF-SMC on CIFAR10 after unlearning 1% (500) randomly selected samples (panels (a) to (d)). Small points denote retained samples from different clusters. Larger points denote forgotten samples, assigned to the retained cluster with the highest semantic similarity.

Figure 5. Overall unlearning performance on MNIST (panels (a) to (d)), CIFAR10 (panels (e) to (h)), and TinyImageNet (panels (i) to (l)). ManiF-SMC is the only method that unlearns without labels; the others require full inputs and labels. SalUn maintains higher utility at larger USS by fine-tuning on retained data. ManiF-SMC can be improved similarly (see [Section G.2](https://arxiv.org/html/2605.22871#A7.SS2) and [Figure 9](https://arxiv.org/html/2605.22871#A7.F9)).
### 5\.2\.Overview Evaluations
Setup. A common evaluation of unlearning tests effectiveness across different unlearning sample sizes (USS). We evaluate how the unlearning methods perform for USS values from 200 to 1200, where 1200 is around 2% of the training data on MNIST and CIFAR10, which is already large for unlearning according to (Chen et al., [2021](https://arxiv.org/html/2605.22871#bib.bib6); Bertram et al., [2019](https://arxiv.org/html/2605.22871#bib.bib2)).

Evaluation of Effectiveness. [Figure 4](https://arxiv.org/html/2605.22871#S5.F4) visualizes the unlearned representation space on CIFAR10 when unlearning 1% randomly selected samples. GA and VBU implement unlearning with a gradient reversal term, which conflicts with the original learning objective and increases the intra-class distance. SalUn additionally fine-tunes on the remaining dataset and therefore has a more compact class representation than GA and VBU. ManiF-SMC preserves a compact class structure and moves erased samples toward their most semantically similar neighbors in the retained set. Its representation geometry closely resembles that of SalUn and of retraining from scratch ([Figure 1](https://arxiv.org/html/2605.22871#S1.F1)), while requiring no class labels and no fine-tuning on the retained data.

The first to third columns in [Figure 5](https://arxiv.org/html/2605.22871#S5.F5) illustrate how each method's unlearning performance evolves as the unlearning sample size USS increases on MNIST, CIFAR10, and Tiny-ImageNet. Additional results on CelebA are presented in [Figure 7](https://arxiv.org/html/2605.22871#A4.F7) in [Section G.1](https://arxiv.org/html/2605.22871#A7.SS1). In the first column of plots, the MIA rate rises with larger USS for all methods. A higher rate means that the MIA more often judges erased samples as not belonging to the training dataset, so removing more data increases the measured unlearning effectiveness. All the unlearning methods (ManiF-SMC, VBU, and SalUn) can effectively unlearn the original models.

The second and third columns track RA on the remaining data and TA on the test data, respectively. As USS increases, both ManiF-SMC and VBU show slight drops in RA and TA, which indicates that these two approximate unlearning methods degrade model utility to some extent. ManiF-SMC tends to preserve accuracy better than VBU on these datasets while using only the manifold representation.

SalUn preserves good model utility as USS increases because it uses the remaining data to fine-tune the model during unlearning. Fine-tuning on the remaining dataset can likewise be added to ManiF-SMC to mitigate its utility degradation; the corresponding results are presented in [Section G.2](https://arxiv.org/html/2605.22871#A7.SS2).

Impact on Efficiency. The fourth column in [Figure 5](https://arxiv.org/html/2605.22871#S5.F5) provides a detailed view of the running time, shown on a logarithmic scale. These results reveal that retraining is by far the most time-consuming approach, requiring more than $10^2$ seconds on MNIST and $10^3$ to $10^4$ seconds on CIFAR10 and TinyImageNet. In contrast, ManiF-SMC, VBU, and SalUn remain under one second, even for larger USS values.
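The setup above can be sketched as a random split of the training indices into an erased set and a retained set for each USS. This is an illustrative sketch only: the helper name `split_forget_retain`, the fixed seed, and the step of 200 between USS values are assumptions, and the training-set sizes are the standard 60,000 (MNIST) and 50,000 (CIFAR10) training examples.

```python
import numpy as np

def split_forget_retain(n_train, uss, seed=0):
    """Randomly pick `uss` training indices to erase; the rest are retained."""
    rng = np.random.default_rng(seed)
    forget_idx = rng.choice(n_train, size=uss, replace=False)
    retain_mask = np.ones(n_train, dtype=bool)
    retain_mask[forget_idx] = False
    return forget_idx, np.flatnonzero(retain_mask)

# Standard training-set sizes; the USS grid (step 200) is an assumption.
for name, n_train in [("MNIST", 60_000), ("CIFAR10", 50_000)]:
    for uss in range(200, 1201, 200):
        forget_idx, retain_idx = split_forget_retain(n_train, uss)
        share = 100 * uss / n_train
        print(f"{name}: USS={uss}, retained={len(retain_idx)}, share={share:.2f}%")
```

With these sizes, a USS of 1200 corresponds to 2.00% of MNIST and 2.40% of CIFAR10 training data.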
Table 1. Ablation study of various machine unlearning methods with (w) and without (w/o) MMCRs. We randomly choose 200 samples from different classes to unlearn. The results with blue color show how much the MMCRs improve, and the results with red color show the negative influence caused by MMCRs. Best and second-best results are highlighted in bold and italic, respectively. ManiF-SMC generally ranks first or second.

| Dataset | Unlearning Method | MIA (%) w/o MMCRs | MIA (%) w MMCRs | RA (%) w/o MMCRs | RA (%) w MMCRs | TA (%) w/o MMCRs | TA (%) w MMCRs | RT (second) w/o MMCRs | RT (second) w MMCRs |
|---|---|---|---|---|---|---|---|---|---|
| MNIST | Retraining | 56.00 | 63.00 (↑7.00) | 99.24 | 99.49 (↑0.25) | 99.03 | 99.23 (↑0.20) | 464.9 | 470.4 (↑5.5) |
| MNIST | GA | 60.00 | 64.50 (↑4.50) | 99.08 | 99.01 (↓0.07) | 98.77 | 98.81 (↑0.04) | 0.220 | 0.202 (↓0.018) |
| MNIST | VBU | 56.00 | 57.00 (↑1.00) | 99.30 | 99.37 (↑0.07) | 99.05 | 99.15 (↑0.10) | 0.198 | 0.201 (↑0.003) |
| MNIST | RFU | 51.00 | 53.50 (↑2.50) | 99.28 | 99.39 (↑0.11) | 99.21 | 99.28 (↑0.07) | 0.342 | 0.354 (↑0.012) |
| MNIST | SalUn | 53.50 | 55.00 (↑1.50) | 99.31 | 99.37 (↑0.06) | 99.17 | 99.22 (↑0.05) | 1.832 | 1.839 (↑0.007) |
| MNIST | ManiF-SMC (Our) | 59.00 | 62.00 (↑3.00) | 99.40 | 99.59 (↑0.19) | 99.02 | 99.15 (↑0.13) | 1.486 | 1.482 (↓0.004) |
| CIFAR10 | Retraining | 61.00 | 61.00 (↑0.00) | 98.92 | 99.28 (↑0.36) | 82.24 | 82.48 (↑0.24) | 1742.0 | 1783.4 (↑41.4) |
| CIFAR10 | GA | 51.00 | 54.00 (↑3.00) | 97.51 | 97.83 (↑0.32) | 78.92 | 79.96 (↑1.04) | 0.285 | 0.273 (↓0.012) |
| CIFAR10 | VBU | 54.00 | 56.00 (↑2.00) | 97.12 | 97.82 (↑0.70) | 79.37 | 80.23 (↑0.86) | 0.324 | 0.331 (↑0.007) |
| CIFAR10 | RFU | 55.00 | 56.00 (↑1.00) | 97.78 | 98.57 (↑0.79) | 79.24 | 80.89 (↑1.65) | 0.503 | 0.522 (↑0.019) |
| CIFAR10 | SalUn | 54.00 | 55.00 (↑1.00) | 98.78 | 98.85 (↑0.07) | 80.42 | 80.83 (↑0.41) | 2.423 | 2.432 (↑0.009) |
| CIFAR10 | ManiF-SMC (Our) | 55.00 | 59.00 (↑4.00) | 98.86 | 99.04 (↑0.18) | 80.58 | 81.75 (↑1.17) | 2.450 | 2.372 (↓0.078) |

MIA: Membership Inference Attack; RA: Remaining Accuracy; TA: Testing Accuracy; RT: Running Time.
### 5\.3\.Ablation Study: Can MMCRs Improve Approximate Unlearning Effectiveness?
Setup. We study the impact of Maximum Manifold Capacity Representations (MMCRs) (Yerxa et al., [2023](https://arxiv.org/html/2605.22871#bib.bib47)) on the performance of various machine unlearning methods. We compare each method when unlearning an original model trained with (w) and without (w/o) MMCRs. We randomly choose 200 samples from different classes as the unlearning dataset, and the corresponding results on MNIST and CIFAR10 are shown in [Table 1](https://arxiv.org/html/2605.22871#S5.T1).

Comparison between "w" and "w/o" MMCRs. When comparing "w" and "w/o" MMCRs, nearly all methods improve in MIA, RA, and TA when MMCRs are included. For example, on CIFAR10, none of the methods (Retraining, GA, VBU, RFU, SalUn, and ManiF-SMC) degrades on any of the three effectiveness metrics when MMCRs are employed, and nearly all entries improve. MMCRs benefit not only the approximate unlearning methods but also the model utility of the retraining method. The cost is that MMCRs increase the running time of the original model training, so the retraining method incurs more computation when MMCRs are employed. However, since most approximate unlearning methods do not involve the original training process, employing MMCRs has little influence on their computation cost.

Comparison between ManiF-SMC and Other Unlearning Methods. Existing approximate unlearning methods have different pros and cons. We focus on the "w" MMCRs scenario. ManiF-SMC achieves the best MIA, RA, and TA on CIFAR10 among all methods except retraining, which trades off effectiveness against efficiency (RT). Although RFU and SalUn achieve a higher TA than ManiF-SMC on MNIST, ManiF-SMC is an approximate unlearning method that focuses only on the data and representations, without relying on label information or on the remaining dataset for fine-tuning. Moreover, GA yields the worst RA and TA since it directly inverts the task loss on the unlearning data, which is harmful to the model utility but beneficial for unlearning (with a high MIA).
### 5\.4\.Ablation Study: Adaptive Margin by Self Mode Connectivity versus Fixed Margin
Setup. We conduct an ablation study on MNIST and CIFAR10 to compare the proposed adaptive margin guided by self mode connectivity (ManiF-SMC) against pure ManiF. For pure ManiF, we set a fixed margin value of 0.01. We evaluate both methods for USS values from 200 to 1000; all other settings are the same as those introduced above.

Results. In [Table 3](https://arxiv.org/html/2605.22871#S5.T3), the MIA of ManiF-SMC is around 1% to 5% higher than that of ManiF for every unlearning sample size on both MNIST and CIFAR10, showing that erased samples are more likely to be classified as not in the training dataset after unlearning. Meanwhile, ManiF-SMC preserves RA and TA much better than ManiF, especially when the unlearning size is large, such as when USS is 800 or 1000. These gains come with a higher running time for ManiF-SMC, as it additionally executes the self mode connectivity module to calculate the margin.

We report additional ablation studies on the influence of different distance metrics, the neighborhood size $k$, and the sampling position $t$ of SMC in the Appendix, in [Sections G.3](https://arxiv.org/html/2605.22871#A7.SS3), [G.4](https://arxiv.org/html/2605.22871#A7.SS4), and [G.5](https://arxiv.org/html/2605.22871#A7.SS5), respectively.
Table 2. Unlearning VAEs with ManiF-SMC.

| Dataset | Metrics | Original | USS = 200 | 400 | 600 | 800 | 1000 |
|---|---|---|---|---|---|---|---|
| MNIST | MIA | 49.39% | 59.50% | 58.50% | 61.60% | 64.49% | 65.53% |
| MNIST | R-MSE | 0.0357 | 0.0358 | 0.0364 | 0.0366 | 0.0370 | 0.0377 |
| MNIST | T-MSE | 0.0360 | 0.0361 | 0.0367 | 0.0369 | 0.0377 | 0.0378 |
| MNIST | Generated Samples | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/original_mnist.jpg) | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/200mnist.jpg) | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/400mnist.jpg) | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/600mnist.jpg) | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/800mnist.jpg) | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/1000mnist.jpg) |
| CIFAR10 | MIA | 53.83% | 54.00% | 58.52% | 57.59% | 61.15% | 66.99% |
| CIFAR10 | R-MSE | 0.00192 | 0.00192 | 0.00194 | 0.00197 | 0.00213 | 0.00223 |
| CIFAR10 | T-MSE | 0.00192 | 0.00192 | 0.00195 | 0.00199 | 0.00216 | 0.00224 |
| CIFAR10 | Generated Samples | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/original_cifar10.jpg) | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/200cifar10.jpg) | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/400cifar10.jpg) | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/600cifar10.jpg) | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/800cifar10.jpg) | ![[Uncaptioned image]](https://arxiv.org/html/2605.22871v1/Contents/figures/1000cifar10.jpg) |
Table 3. Ablation evaluation of the adaptive margin by self-mode-connectivity (SMC). The results with blue color show how much the SMC improves, and the results with red color show the negative influence caused by SMC.

| Dataset | Unlearning Sample Size | MIA (%) ManiF (Fixed Margin) | MIA (%) ManiF-SMC (Adaptive Margin) | RA (%) ManiF (Fixed Margin) | RA (%) ManiF-SMC (Adaptive Margin) | TA (%) ManiF (Fixed Margin) | TA (%) ManiF-SMC (Adaptive Margin) | RT (second) ManiF (Fixed Margin) | RT (second) ManiF-SMC (Adaptive Margin) |
|---|---|---|---|---|---|---|---|---|---|
| MNIST | 200 | 61.00 | 62.00 (↑1.00) | 99.55 | 99.59 (↑0.04) | 99.10 | 99.15 (↑0.05) | 1.33 | 1.48 (↑0.15) |
| MNIST | 400 | 59.50 | 61.50 (↑2.00) | 99.46 | 99.49 (↑0.03) | 99.10 | 99.12 (↑0.02) | 2.59 | 3.10 (↑0.51) |
| MNIST | 600 | 58.35 | 62.32 (↑3.97) | 99.43 | 99.49 (↑0.06) | 99.04 | 99.11 (↑0.07) | 4.02 | 4.65 (↑0.63) |
| MNIST | 800 | 59.38 | 61.24 (↑1.86) | 99.08 | 99.46 (↑0.38) | 99.02 | 99.13 (↑0.11) | 5.79 | 6.78 (↑0.99) |
| MNIST | 1000 | 59.60 | 63.39 (↑3.79) | 98.93 | 99.43 (↑0.50) | 98.87 | 99.11 (↑0.24) | 6.53 | 7.56 (↑1.03) |
| CIFAR10 | 200 | 56.00 | 59.00 (↑3.00) | 98.96 | 99.04 (↑0.08) | 81.71 | 81.75 (↑0.04) | 2.02 | 2.37 (↑0.35) |
| CIFAR10 | 400 | 58.78 | 63.07 (↑4.29) | 98.50 | 98.46 (↓0.04) | 80.97 | 80.92 (↓0.05) | 4.52 | 5.19 (↑0.67) |
| CIFAR10 | 600 | 63.19 | 66.98 (↑3.79) | 97.16 | 97.68 (↑0.52) | 79.60 | 80.25 (↑0.65) | 6.40 | 7.09 (↑0.69) |
| CIFAR10 | 800 | 63.89 | 67.88 (↑3.99) | 96.12 | 97.14 (↑1.02) | 78.10 | 79.09 (↑0.99) | 8.57 | 9.32 (↑0.75) |
| CIFAR10 | 1000 | 65.33 | 69.47 (↑4.14) | 89.80 | 94.91 (↑5.11) | 72.34 | 77.00 (↑4.66) | 10.25 | 11.83 (↑1.58) |

MIA: Membership Inference Attack; RA: Remaining Accuracy; TA: Testing Accuracy; RT: Running Time.
### 5\.5\.Application: Can ManiF\-SMC Unlearn Generative Models?
Setup. Generative tasks are also a key category of machine-learning services, so we evaluate ManiF-SMC in a generative learning setting. Specifically, we deploy ManiF-SMC to unlearn a VAE model (Kingma and Welling, [2014](https://arxiv.org/html/2605.22871#bib.bib21)) and evaluate its performance with three metrics: the MIA accuracy to quantify unlearning effectiveness, and the mean-squared reconstruction error on the remaining (R-MSE) and test (T-MSE) sets to measure utility. The corresponding results for MNIST and CIFAR10 are shown in [Table 2](https://arxiv.org/html/2605.22871#S5.T2).

Results. In [Table 2](https://arxiv.org/html/2605.22871#S5.T2), the MIA metric exhibits a clear upward trend as USS increases on both MNIST and CIFAR10, demonstrating that ManiF-SMC unlearns effectively. Both R-MSE and T-MSE increase relative to the original model, which indicates a slight degradation of model utility accompanying the unlearning. The generated samples show that the VAE still recovers high-quality samples after unlearning.
### 5\.6\.Application: Unlearning when Having Limited Access to Task Label Information
There are practical scenarios, such as semantic communications, in which the sender and receiver jointly train encoders and decoders (Huang et al., [2022](https://arxiv.org/html/2605.22871#bib.bib19)). Unlearning requests would be common for the sender but are impractical to implement because the sender can access only the encoder. Existing unlearning methods typically need access to both the encoder and the decoder, as well as the training task information, to design their knowledge removal procedures, which commonly include a gradient ascent term (Guo et al., [2020](https://arxiv.org/html/2605.22871#bib.bib18)). Here, we conduct experiments to evaluate how different methods apply to semantic communication scenarios with and without access to the decoder.

We present the results of unlearning semantic models trained on CelebA in [Table 4](https://arxiv.org/html/2605.22871#S5.T4). If the decoder of the semantic communication model is not available, the existing unlearning methods (Retrain, GA, VBU, RFU, SalUn) cannot be applied to the semantic communication system. They all need the training task information and access to the decoder, as their unlearning procedures and losses are closely tied to the task. Only ManiF-SMC is suitable for this complex but practical scenario, as it achieves unlearning based solely on the learned manifold representation and therefore needs access only to the encoder. Moreover, ManiF-SMC achieves performance close to that of the retraining method with decoder access.
Table 4. Results of unlearning semantic communication systems with only access to the encoder on CelebA. Only ManiF-SMC is suitable for this scenario when the semantic decoder is not available.

| Method | MIA (%) | RA (%) | TA (%) |
|---|---|---|---|
| Retrain (Decoder Available) | 60.00 | 96.59 | 96.13 |
| Retrain (Dec. not Available) | – | – | – |
| GA (Dec. Available) | 61.00 | 92.27 | 91.02 |
| GA (Dec. not Available) | – | – | – |
| VBU (Dec. Available) | 56.00 | 96.52 | 96.02 |
| VBU (Dec. not Available) | – | – | – |
| RFU (Dec. Available) | 54.00 | 96.43 | 95.89 |
| RFU (Dec. not Available) | – | – | – |
| SalUn (Dec. Available) | 54.50 | 96.51 | 96.27 |
| SalUn (Dec. not Available) | – | – | – |
| ManiF-SMC (Dec. Available) | 54.50 | 96.34 | 95.93 |
| ManiF-SMC (Dec. not Available) | 54.50 | 96.34 | 95.93 |

–: the unlearning method is not applicable.
## 6\.Summary and Future Work
In this paper, we investigate machine unlearning from a novel perspective: the model's learned manifold representations. Motivated by our empirical observation, we first reformulate approximate unlearning as a representation-space relocation problem: *push each erased sample away from its original representation while pulling it toward the centroid of its top-$k$ nearest retained samples.* We then propose ManiF-SMC, a label-agnostic unlearning method that operates purely in representation space. ManiF-SMC includes a manifold contrastive forgetting (ManiF) method that implements unlearning with a margin-based triplet loss and a self mode connectivity (SMC) method that adaptively calculates the margin.

We outline several promising directions stemming from this work. First, our experimental observation provides a new angle for formulating approximate unlearning more robustly than previous gradient-ascent-based perspectives. Second, our findings indicate that MMCRs can enhance existing unlearning approaches, offering useful design insights for future methods. Last but not least, investigating unlearning solely in the learned representation space provides broader applicability to more complex scenarios, such as pre-trained models or semantic communications, where a single encoder is shared across multiple downstream tasks.
## References
- Bertram et al\.\(2019\)Theo Bertram, Elie Bursztein, Stephanie Caro, Hubert Chao, Rutledge Chin Feman, Peter Fleischer, Albin Gustafsson, Jess Hemerly, Chris Hibbert, Luca Invernizzi, et al\.2019\.Five years of the right to be forgotten\. In*Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security*\. 959–972\.
- Bourtoule et al\.\(2021\)Lucas Bourtoule, Varun Chandrasekaran, Christopher A Choquette\-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, and Nicolas Papernot\. 2021\.Machine unlearning\. In*2021 IEEE Symposium on Security and Privacy \(SP\)*\. IEEE, 141–159\.
- Cao and Yang \(2015\)Yinzhi Cao and Junfeng Yang\. 2015\.Towards making systems forget with machine unlearning\. In*2015 IEEE Symposium on Security and Privacy*\. IEEE, 463–480\.
- Chen et al\.\(2023\)Min Chen, Weizhuo Gao, Gaoyang Liu, Kai Peng, and Chen Wang\. 2023\.Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary\. In*Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition*\. 7766–7775\.
- Chen et al\.\(2021\)Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, and Yang Zhang\. 2021\.When machine unlearning jeopardizes privacy\. In*Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security*\. 896–911\.
- Chen et al\.\(2020\)Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton\. 2020\.A simple framework for contrastive learning of visual representations\. In*International conference on machine learning*\. PMLR, 1597–1607\.
- Chundawat et al\.\(2023\)Vikram S Chundawat, Ayush K Tarun, Murari Mandal, and Mohan Kankanhalli\. 2023\.Can bad teaching induce forgetting? unlearning in deep networks using an incompetent teacher\. In*Proceedings of the AAAI Conference on Artificial Intelligence*, Vol\. 37\. 7210–7217\.
- Chung et al\.\(2018\)SueYeon Chung, Daniel D Lee, and Haim Sompolinsky\. 2018\.Classification and geometry of general perceptual manifolds\.*Physical Review X*8, 3 \(2018\), 031003\.
- Deng \(2012\)Li Deng\. 2012\.The mnist database of handwritten digit images for machine learning research \[best of the web\]\.*IEEE signal processing magazine*29, 6 \(2012\), 141–142\.
- Ebrahimpour\-Boroojeny et al\.\(2025\)Ali Ebrahimpour\-Boroojeny, Hari Sundaram, and Varun Chandrasekaran\. 2025\.Not All Wrong is Bad: Using Adversarial Examples for Unlearning\. In*Forty\-second International Conference on Machine Learning*\.
- Fan et al\.\(2024\)Chongyu Fan, Jiancheng Liu, Yihua Zhang, Eric Wong, Dennis Wei, and Sijia Liu\. 2024\.SalUn: Empowering Machine Unlearning via Gradient\-based Weight Saliency in Both Image Classification and Generation\. In*The Twelfth International Conference on Learning Representations*\.
- Fu et al\.\(2022\)Shaopeng Fu, Fengxiang He, and Dacheng Tao\. 2022\.Knowledge Removal in Sampling\-based Bayesian Inference\. In*The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25\-29, 2022*\. OpenReview\.net\.
- Gardner \(1988\)Elizabeth Gardner\. 1988\.The space of interactions in neural network models\.*Journal of physics A: Mathematical and general*21, 1 \(1988\), 257\.
- Garipov et al\.\(2018\)Timur Garipov, Pavel Izmailov, Dmitrii Podoprikhin, Dmitry P Vetrov, and Andrew G Wilson\. 2018\.Loss surfaces, mode connectivity, and fast ensembling of dnns\.*Advances in neural information processing systems*31 \(2018\)\.
- Golatkar et al\.\(2020\)Aditya Golatkar, Alessandro Achille, and Stefano Soatto\. 2020\.Eternal sunshine of the spotless net: Selective forgetting in deep networks\. In*Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition*\. 9304–9312\.
- Graves et al\.\(2021\)Laura Graves, Vineel Nagisetty, and Vijay Ganesh\. 2021\.Amnesiac machine learning\. In*Proceedings of the AAAI Conference on Artificial Intelligence*, Vol\. 35\. 11516–11524\.
- Guo et al\.\(2020\)Chuan Guo, Tom Goldstein, Awni Y\. Hannun, and Laurens van der Maaten\. 2020\.Certified Data Removal from Machine Learning Models\. In*Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13\-18 July 2020, Virtual Event**\(Proceedings of Machine Learning Research, Vol\. 119\)*\. PMLR, 3832–3842\.
- Huang et al\.\(2022\)Danlan Huang, Feifei Gao, Xiaoming Tao, Qiyuan Du, and Jianhua Lu\. 2022\.Toward semantic communications: Deep learning\-based image semantic coding\.*IEEE Journal on Selected Areas in Communications*41, 1 \(2022\), 55–71\.
- Izzo et al\.\(2021\)Zachary Izzo, Mary Anne Smart, Kamalika Chaudhuri, and James Zou\. 2021\.Approximate data deletion from machine learning models\. In*International Conference on Artificial Intelligence and Statistics*\. PMLR, 2008–2016\.
- Kingma and Welling \(2014\)Diederik P Kingma and Max Welling\. 2014\.Auto\-Encoding Variational Bayes\.*stat*1050 \(2014\), 1\.
- Krizhevsky et al\.\(2009\)Alex Krizhevsky, Geoffrey Hinton, et al\.2009\.Learning multiple layers of features from tiny images\.\(2009\)\.
- Le and Yang \(2015\)Yann Le and Xuan Yang\. 2015\.Tiny imagenet visual recognition challenge\.*CS 231N*7, 7 \(2015\), 3\.
- Li et al\.\(2020\)Tian Li, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith\. 2020\.Federated learning: Challenges, methods, and future directions\.*IEEE signal processing magazine*37, 3 \(2020\), 50–60\.
- Liu et al\.\(2018\)Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang\. 2018\.Large\-scale celebfaces attributes \(celeba\) dataset\.*Retrieved August*15, 2018 \(2018\), 11\.
- Mantelero \(2013\)Alessandro Mantelero\. 2013\.The EU Proposal for a General Data Protection Regulation and the roots of the ‘right to be forgotten’\.*Comput\. Law Secur\. Rev\.*29, 3 \(2013\), 229–235\.
- Neel et al\.\(2021\)Seth Neel, Aaron Roth, and Saeed Sharifi\-Malvajerdi\. 2021\.Descent\-to\-delete: Gradient\-based methods for machine unlearning\. In*Algorithmic Learning Theory*\. PMLR, 931–962\.
- Nguyen et al\.\(2020\)Quoc Phong Nguyen, Bryan Kian Hsiang Low, and Patrick Jaillet\. 2020\.Variational bayesian unlearning\.*Advances in Neural Information Processing Systems*33 \(2020\), 16025–16036\.
- Nguyen et al\.\(2022\)Quoc Phong Nguyen, Ryutaro Oikawa, Dinil Mon Divakaran, Mun Choon Chan, and Bryan Kian Hsiang Low\. 2022\.Markov Chain Monte Carlo\-Based Machine Unlearning: Unlearning What Needs to be Forgotten\. In*ASIA CCS ’22: ACM Asia Conference on Computer and Communications Security, Nagasaki, Japan, 30 May 2022 \- 3 June 2022*\. ACM, 351–363\.
- Schroff et al\.\(2015\)Florian Schroff, Dmitry Kalenichenko, and James Philbin\. 2015\.Facenet: A unified embedding for face recognition and clustering\. In*Proceedings of the IEEE conference on computer vision and pattern recognition*\. 815–823\.
- Sekhari et al\.\(2021\)Ayush Sekhari, Jayadev Acharya, Gautam Kamath, and Ananda Theertha Suresh\. 2021\.Remember what you want to forget: Algorithms for machine unlearning\.*Advances in Neural Information Processing Systems*34 \(2021\)\.
- Shah et al\.\(2025\)Vedant Shah, Frederik Träuble, Ashish Malik, Hugo Larochelle, Michael Curtis Mozer, Sanjeev Arora, Yoshua Bengio, and Anirudh Goyal\. 2025\.Low Compute Unlearning via Sparse Representations\.*Transactions on Machine Learning Research*\(2025\)\.
- Shen et al\.\(2024\)Shaofei Shen, Chenhao Zhang, Yawen Zhao, Alina Bialkowski, Weitong Tony Chen, and Miao Xu\. 2024\.Label\-Agnostic Forgetting: A Supervision\-Free Unlearning in Deep Models\. In*The Twelfth International Conference on Learning Representations*\.
- Song et al\.\(2019\)Liwei Song, Reza Shokri, and Prateek Mittal\. 2019\.Privacy risks of securing machine learning models against adversarial examples\. In*Proceedings of the 2019 ACM SIGSAC conference on computer and communications security*\. 241–257\.
- Thudi et al\.\(2022a\)Anvith Thudi, Gabriel Deza, Varun Chandrasekaran, and Nicolas Papernot\. 2022a\.Unrolling sgd: Understanding factors influencing machine unlearning\. In*2022 IEEE 7th European Symposium on Security and Privacy \(EuroS&P\)*\. IEEE, 303–319\.
- Thudi et al\.\(2022b\)Anvith Thudi, Hengrui Jia, Ilia Shumailov, and Nicolas Papernot\. 2022b\.On the necessity of auditable algorithmic definitions for machine unlearning\. In*31st USENIX Security Symposium \(USENIX Security 22\)*\. 4007–4022\.
- Tian et al\.\(2021\)Yuandong Tian, Xinlei Chen, and Surya Ganguli\. 2021\.Understanding self\-supervised learning dynamics without contrastive pairs\. In*International Conference on Machine Learning*\. PMLR, 10268–10278\.
- Tian et al\.\(2024\)Zhiyi Tian, Chenhan Zhang, Weiqi Wang, Hanna Bogucka, and Shui Yu\. 2024\.ROSE: A Receiver\-Oriented Semantic Communication Framework\.*IEEE Network*\(2024\)\.
- Träuble et al\.\(2023\)Frederik Träuble, Anirudh Goyal, Nasim Rahaman, Michael Curtis Mozer, Kenji Kawaguchi, Yoshua Bengio, and Bernhard Schölkopf\. 2023\.Discrete key\-value bottleneck\. In*International conference on machine learning*\. PMLR, 34431–34455\.
- Wang and Isola \(2020\)Tongzhou Wang and Phillip Isola\. 2020\.Understanding contrastive representation learning through alignment and uniformity on the hypersphere\. In*International conference on machine learning*\. PMLR, 9929–9939\.
- Wang et al\.\(2024\)Weiqi Wang, Chenhan Zhang, Zhiyi Tian, and Shui Yu\. 2024\.Machine Unlearning via Representation Forgetting With Parameter Self\-Sharing\.*IEEE Transactions on Information Forensics and Security*19 \(2024\), 1099–1111\.
- Warnecke et al\.\(2024\)Alexander Warnecke, Lukas Pirch, Christian Wressnegger, and Konrad Rieck\. 2024\.Machine unlearning of features and labels\.*31st Annual Network and Distributed System Security Symposium, NDSS 2024*\(2024\)\.
- Weinberger and Saul \(2009\)Kilian Q Weinberger and Lawrence K Saul\. 2009\.Distance metric learning for large margin nearest neighbor classification\.*Journal of machine learning research*10, 2 \(2009\)\.
- Wu et al\.\(2020\)Yinjun Wu, Edgar Dobriban, and Susan Davidson\. 2020\.Deltagrad: Rapid retraining of machine learning models\. In*International Conference on Machine Learning*\. PMLR, 10355–10366\.
- Yan et al\.\(2022\)Haonan Yan, Xiaoguang Li, Ziyao Guo, Hui Li, Fenghua Li, and Xiaodong Lin\. 2022\.ARCANE: An Efficient Architecture for Exact Machine Unlearning\. In*Proceedings of the Thirty\-First International Joint Conference on Artificial Intelligence, IJCAI 2022, Vienna, Austria, 23\-29 July 2022*, Luc De Raedt \(Ed\.\)\. ijcai\.org, 4006–4013\.
- Yeom et al\.\(2018\)Samuel Yeom, Irene Giacomelli, Matt Fredrikson, and Somesh Jha\. 2018\.Privacy risk in machine learning: Analyzing the connection to overfitting\. In*2018 IEEE 31st computer security foundations symposium \(CSF\)*\. IEEE, 268–282\.
- Yerxa et al\.\(2023\)Thomas Yerxa, Yilun Kuang, Eero Simoncelli, and SueYeon Chung\. 2023\.Learning efficient coding of natural images with maximum manifold capacity representations\.*Advances in Neural Information Processing Systems*36 \(2023\), 24103–24128\.
- Zhao et al\.\(2020\)Pu Zhao, Pin\-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, and Xue Lin\. 2020\.Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness\. In*8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26\-30, 2020*\. OpenReview\.net\.
## Appendix A Training with MMCR for Generative Models

Figure 6. An example of the representations a model learns with and without MMCRs on MNIST. MMCRs support learning well-separated representations, inspiring us to unlearn samples by focusing only on the corresponding manifold representation.

[Figure 6](https://arxiv.org/html/2605.22871#A1.F6) compares training a self-supervised model with and without MMCRs (Yerxa et al., [2023](https://arxiv.org/html/2605.22871#bib.bib47)) on MNIST. With MMCRs during training, the points of each class stay close together (small intra-class distance) while different classes lie far apart (large inter-class distance). These properties inspire us to investigate an efficient approximate unlearning method that operates only within the manifold representation.
Table 5. Frequently used notations in ManiF-SMC.

| Symbol | Meaning |
| --- | --- |
| $\theta_o$ | Original model parameters (trained on $S$). |
| $\theta_u$ | Unlearned model parameters after forgetting $S_u$. |
| $\tilde{\theta}$ | Surrogate model sampled on the SMC path, $\tilde{\theta} = \phi_w(t^\star)$. |
| $S$ | Original training set. |
| $S_u$ | Unlearning (erased) set. |
| $S_r$ | Remaining (retained) set, $S_r = S \setminus S_u$. |
| $x_i$ | An erased target sample, $x_i \in S_u$. |
| $S_k^i$ | Top-$k$ most similar retained samples for $x_i$, selected in the original representation space; $S_k^i \subseteq S_r$. |
| $S_k$ | Union of retained neighborhoods, $S_k = \cup_{x_i \in S_u} S_k^i$. |
| $k$ | Number of retained neighbors used for each erased sample. |
| $\mathtt{z}_{i,o}$ | Original representation of $x_i$: $\mathtt{z}_{i,o} = f_{\theta_o}(x_i)$. |
| $\mathtt{z}_{i,u}$ | Unlearned representation of $x_i$: $\mathtt{z}_{i,u} = f_{\theta_u}(x_i)$. |
| $\mathtt{c}_{i,o}$ | Neighbor centroid under the original model (computed on $S_k^i$ using $f_{\theta_o}$). |
| $\mathtt{c}_{i,u}$ | Neighbor centroid under the unlearned model: $\mathtt{c}_{i,u} = \frac{1}{\lvert S_k^i \rvert} \sum_{x_j \in S_k^i} f_{\theta_u}(x_j)$. |
| $\mathtt{c}_{i,\tilde{\theta}}$ | Neighbor centroid estimated by the surrogate model: $\mathtt{c}_{i,\tilde{\theta}} = \frac{1}{\lvert S_k^i \rvert} \sum_{x_j \in S_k^i} f_{\tilde{\theta}}(x_j)$. |
| $w$ | Learnable control point of the quadratic Bézier path. |
| $t$ | Path parameter, sampled from $\mathcal{U}[0,1]$ when optimizing $w$. |
| $t^\star$ | Fixed sampling position on the path (e.g., $t^\star = 0.5$) used to form $\tilde{\theta}$. |
## Appendix B Notations in ManiF-SMC
We summarize the notations in [Table 5](https://arxiv.org/html/2605.22871#A1.T5).
**Algorithm 1** ManiF-SMC

**Input:** Trained model $\theta_o$; unlearning set $S_u$; retained neighbor sets $\{S_k^i\}_{x_i \in S_u}$ and $S_k = \cup_{x_i \in S_u} S_k^i$; distance $\mathrm{dist}(\cdot,\cdot)$; control point $w$; learning rate $\eta$; epochs $E$; fixed $t^\star$ (e.g., $0.5$).

**Output:** Updated model parameters $\theta_u$.

- *Initialization*
- 1: $\theta_u \leftarrow \theta_o$ (start from original model)
- 2: $w \leftarrow \theta$ (initialize Bézier control point)
- 3: **foreach** $x_i \in S_u$ **do**
  - 4: $\mathtt{z}_{i,o} \leftarrow f_{\theta_o}(x_i)$ (cache original representation)
- **end foreach**
- 5: **for** $e = 1$ **to** $E$ epochs **do**
  - *(A) Learn SMC control point $w$ on retained neighborhood data*
  - 6: **foreach** mini-batch $B_k \subset S_k$ **do**
    - 7: sample $t \sim \mathcal{U}[0,1]$
    - 8: $\theta_t \leftarrow \phi_w(t) = (1-t)^2 \theta_u + 2t(1-t) w + t^2 \theta_o$
    - 9: $\mathcal{L}_{\mathrm{path}} \leftarrow \mathcal{L}_{B_k}(\theta_t)$ (retained loss, same as training loss)
    - 10: $w \leftarrow w - \eta \nabla_w \mathcal{L}_{\mathrm{path}}$
  - **end foreach**
  - *(B) Pick a surrogate model on the low-loss path*
  - 11: $\tilde{\theta} \leftarrow \phi_w(t^\star)$ (we pick $t^\star = 0.5$ in experiments)
  - *(C) ManiF update: compute centroid + adaptive margin under $\tilde{\theta}$, update $\theta_u$*
  - 12: **foreach** mini-batch $B_u \subset S_u$ **do**
    - 13: **foreach** $x_i \in B_u$ **do**
      - 14: $\mathtt{c}_{i,\tilde{\theta}} \leftarrow \frac{1}{\lvert S_k^i \rvert} \sum_{x_j \in S_k^i} f_{\tilde{\theta}}(x_j)$
      - 15: $\alpha_i \leftarrow \big[\mathrm{dist}\big(f_{\tilde{\theta}}(x_i), \mathtt{z}_{i,o}\big) - \mathrm{dist}\big(f_{\tilde{\theta}}(x_i), \mathtt{c}_{i,\tilde{\theta}}\big)\big]_+$
      - 16: $\ell_i \leftarrow \big[\mathrm{dist}\big(f_{\theta_u}(x_i), \mathtt{c}_{i,\tilde{\theta}}\big) - \mathrm{dist}\big(f_{\theta_u}(x_i), \mathtt{z}_{i,o}\big) + \alpha_i\big]_+$
    - **end foreach**
    - 17: $\mathcal{L}_{\mathrm{triplet}} \leftarrow \sum_{x_i \in B_u} \ell_i$
    - 18: $\theta_u \leftarrow \theta_u - \eta \nabla_{\theta_u} \mathcal{L}_{\mathrm{triplet}}$
  - **end foreach**
- **end for**
- 19: **return** $\theta_u$
## Appendix C Implementation Details of ManiF-SMC and Discussion
Overview. Algorithm [1](https://arxiv.org/html/2605.22871#algorithm1) summarizes ManiF-SMC, which alternates between (i) constructing a retained-geometry surrogate via self mode connectivity (SMC) and (ii) updating the unlearning model parameters via the manifold contrastive forgetting (ManiF) objective. The key design is that both the retained-neighbor centroid and the triplet margin are computed under a surrogate model sampled on a low-loss path trained only on retained neighborhood data, avoiding the need to retrain from scratch on $S_r$.
#### Inputs and preprocessing\.
The algorithm takes the trained model $\theta_o$, the erased (unlearning) set $S_u$, and, for each $x_i \in S_u$, a retained neighborhood $S_k^i$ (the top-$k$ most similar samples from the remaining data), with $S_k = \cup_{x_i \in S_u} S_k^i$. In practice, $S_k^i$ can be built once using the original representation space (e.g., the nearest neighbors of $\mathtt{z}_{i,o} = f_{\theta_o}(x_i)$) and then kept fixed during unlearning to reduce overhead. We also cache $\mathtt{z}_{i,o}$ for all erased samples (Lines 3–4), since it serves as the "negative" reference in the triplet constraint and does not change throughout the process.
Note that the algorithm does not rely on label information, so its inputs do not include the labels $y_i$.
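The neighborhood construction described above can be sketched as follows. This is a minimal illustration, assuming Euclidean nearest neighbors over precomputed representations; the function name `topk_retained_neighbors` and the random toy data are hypothetical stand-ins for encoder outputs, not the authors' implementation.

```python
import numpy as np

def topk_retained_neighbors(z_erased, z_retained, k):
    """Indices of the k nearest retained samples for each erased sample.

    z_erased:   (n_u, d) array, original representations of erased samples.
    z_retained: (n_r, d) array, original representations of retained samples.
    Returns an (n_u, k) integer array of row indices into z_retained.
    """
    # Pairwise squared Euclidean distances, shape (n_u, n_r).
    diff = z_erased[:, None, :] - z_retained[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Sort each row and keep the k closest retained samples.
    return np.argsort(sq_dist, axis=1)[:, :k]

# Toy data standing in for encoder outputs (hypothetical, not from the paper).
rng = np.random.default_rng(0)
z_retained = rng.normal(size=(50, 8))
z_erased = z_retained[:3] + 0.01 * rng.normal(size=(3, 8))

neighbor_idx = topk_retained_neighbors(z_erased, z_retained, k=5)  # S_k^i per erased sample
union_idx = np.unique(neighbor_idx)                                # S_k as a set of indices
```

Because the neighborhoods are computed once and kept fixed, the full pairwise distance matrix is built a single time; for large retained sets an approximate nearest-neighbor index would likely be preferable.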
#### Step (A): learning the SMC path control point $w$.
Given endpoints $\theta_u$ (current unlearning model) and $\theta_o$ (original model), we parameterize a quadratic Bézier curve $\phi_w(t) = (1-t)^2 \theta_u + 2t(1-t) w + t^2 \theta_o$. In Step (A), we update the control point $w$ by minimizing the retained loss along the path (Eq. ([8](https://arxiv.org/html/2605.22871#S4.E8))) using only mini-batches from $S_k$ (Lines 5–10). This yields a low-loss connection in parameter space that approximates the retained-data geometry without requiring full retraining on $S_r$. Importantly, the gradient is taken *w.r.t. $w$ only*; $\theta_u$ is held fixed during this inner update so that the path training remains lightweight.
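A toy sketch of Step (A) may help make the update concrete. It assumes flat parameter vectors and replaces the retained loss with a simple quadratic bowl (`toy_loss`, a hypothetical stand-in); the initialization of $w$ at $\theta_o$ is also an illustrative choice. Only $w$ is updated, and the chain rule gives $\nabla_w \mathcal{L}(\phi_w(t)) = 2t(1-t)\,\nabla_\theta \mathcal{L}(\theta_t)$.

```python
import numpy as np

def bezier_point(theta_u, w, theta_o, t):
    """Quadratic Bezier curve phi_w(t) with endpoints theta_u (t=0) and theta_o (t=1)."""
    return (1 - t) ** 2 * theta_u + 2 * t * (1 - t) * w + t ** 2 * theta_o

def toy_loss(theta, target):
    """Stand-in for the retained loss: a quadratic bowl (assumption for illustration)."""
    return 0.5 * np.sum((theta - target) ** 2)

def train_control_point(theta_u, theta_o, target, lr=0.5, steps=2000, seed=0):
    """SGD on the control point w only; both endpoints stay fixed."""
    rng = np.random.default_rng(seed)
    w = theta_o.copy()                      # illustrative initialization
    for _ in range(steps):
        t = rng.uniform(0.0, 1.0)           # t ~ U[0, 1]
        theta_t = bezier_point(theta_u, w, theta_o, t)
        grad_theta = theta_t - target       # gradient of toy_loss at theta_t
        # Chain rule: d(theta_t)/dw = 2 t (1 - t).
        w = w - lr * 2 * t * (1 - t) * grad_theta
    return w

theta_u = np.array([0.0, 0.0])
theta_o = np.array([2.0, 0.0])
target = np.array([1.0, 1.0])

loss_before = toy_loss(bezier_point(theta_u, theta_o, theta_o, 0.5), target)
w = train_control_point(theta_u, theta_o, target)
loss_after = toy_loss(bezier_point(theta_u, w, theta_o, 0.5), target)
```

In this toy setting the mid-path loss drops after training $w$, while the endpoints are untouched; in a real network the same scaling factor $2t(1-t)$ is applied automatically by backpropagating through $\phi_w(t)$.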
#### Step (B): sampling a surrogate model $\tilde{\theta}$.
After optimizing $w$ on the retained neighborhood, we select a surrogate model $\tilde{\theta} = \phi_w(t^\star)$ (Line 11). We use a fixed $t^\star$ (e.g., $0.5$) for simplicity and stability. Empirically, mid-path sampling often provides a good trade-off between staying close to the retained geometry and not overfitting to either endpoint, while preserving the "low-loss" property by construction.
#### Step \(C\): ManiF update with centroid and adaptive margin\.
For each erased sample $x_i$, we compute the retained-neighbor centroid under the surrogate model, $\mathtt{c}_{i,\tilde{\theta}} = \frac{1}{\lvert S_k^i \rvert} \sum_{x_j \in S_k^i} f_{\tilde{\theta}}(x_j)$, and an adaptive margin $\alpha_i = \big[\mathrm{dist}(f_{\tilde{\theta}}(x_i), \mathtt{z}_{i,o}) - \mathrm{dist}(f_{\tilde{\theta}}(x_i), \mathtt{c}_{i,\tilde{\theta}})\big]_+$ (Lines 14–15), which corresponds to Eq. ([10](https://arxiv.org/html/2605.22871#S4.E10)). We then update $\theta_u$ by minimizing the triplet hinge objective (Eq. ([11](https://arxiv.org/html/2605.22871#S4.E11))) on mini-batches from $S_u$ (Lines 17–18). Conceptually, this enforces that the updated representation $f_{\theta_u}(x_i)$ moves at least $\alpha_i$ closer to the retained-neighbor centroid than to its original representation $\mathtt{z}_{i,o}$, implementing the push–pull unlearning behavior in a single margin-ranking constraint.
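The margin and hinge terms from Algorithm 1 can be checked numerically on a small example. The sketch below assumes Euclidean distance and works directly on representation vectors rather than on a network; the function names and the 2-D toy points are hypothetical.

```python
import numpy as np

def dist(a, b):
    """Euclidean distance along the last axis."""
    return np.linalg.norm(a - b, axis=-1)

def adaptive_margin(z_sur, z_orig, c_sur):
    """alpha_i = [dist(f_sur(x_i), z_{i,o}) - dist(f_sur(x_i), c_{i,sur})]_+"""
    return np.maximum(dist(z_sur, z_orig) - dist(z_sur, c_sur), 0.0)

def triplet_hinge(z_u, c_sur, z_orig, alpha):
    """l_i = [dist(f_u(x_i), c_{i,sur}) - dist(f_u(x_i), z_{i,o}) + alpha_i]_+"""
    return np.maximum(dist(z_u, c_sur) - dist(z_u, z_orig) + alpha, 0.0)

z_orig = np.array([[0.0, 0.0]])   # cached original representation z_{i,o}
c_sur = np.array([[4.0, 0.0]])    # retained-neighbor centroid under the surrogate
z_sur = np.array([[3.0, 0.0]])    # surrogate representation of the erased sample

alpha = adaptive_margin(z_sur, z_orig, c_sur)              # 3 - 1 = 2
loss_start = triplet_hinge(z_orig, c_sur, z_orig, alpha)   # 4 - 0 + 2 = 6
loss_moved = triplet_hinge(z_sur, c_sur, z_orig, alpha)    # 1 - 3 + 2 = 0
```

At the start of unlearning the representation coincides with $\mathtt{z}_{i,o}$, so the hinge is active; once the representation reaches the position the surrogate suggests, the constraint is satisfied and the loss for that sample vanishes.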
#### Why the surrogate is necessary\.
A core difficulty in representation-based unlearning is that the desired centroid $\mathtt{c}_{i,u}$ and a suitable margin $\alpha$ are naturally defined under a model retrained on $S_r$, which is unavailable in efficient approximate unlearning. ManiF-SMC addresses this by using SMC to construct a surrogate $\tilde{\theta}$ that is trained only on retained neighborhood data. Since $\tilde{\theta}$ is sampled from a low-loss connecting path, it provides a stable estimate of retained-neighbor representations (supported by Proposition [1](https://arxiv.org/html/2605.22871#S4.Thmtheorem1)) and yields a data-dependent margin $\alpha_i$ that adapts to local geometry rather than using a global constant.
#### Distance function choices\.
We use Euclidean distance (the $L_2$ norm) by default for $\mathrm{dist}(\cdot,\cdot)$ in Algorithm [1](https://arxiv.org/html/2605.22871#algorithm1), which is standard in metric learning. Alternative metrics (e.g., cosine similarity) can be plugged in without changing the algorithm structure. We evaluate different distance metrics in [Section G.3](https://arxiv.org/html/2605.22871#A7.SS3).
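To illustrate how the distance can be swapped without touching the rest of the procedure, the sketch below passes the distance as a function argument. It uses plain cosine distance ($1 - \cos$), which is an assumption made for simplicity and is not the NT-Xent formulation evaluated in Section G.3; all names are hypothetical.

```python
import numpy as np

def euclidean_distance(a, b):
    return np.linalg.norm(a - b, axis=-1)

def cosine_distance(a, b, eps=1e-12):
    """1 - cosine similarity; eps guards against division by zero."""
    num = np.sum(a * b, axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + eps
    return 1.0 - num / den

def margin_with(dist_fn, z_sur, z_orig, c_sur):
    """Adaptive margin with a pluggable distance function."""
    return np.maximum(dist_fn(z_sur, z_orig) - dist_fn(z_sur, c_sur), 0.0)

z_sur = np.array([0.0, 2.0])
z_orig = np.array([1.0, 0.0])
c_sur = np.array([0.0, 1.0])

m_euc = margin_with(euclidean_distance, z_sur, z_orig, c_sur)  # sqrt(5) - 1
m_cos = margin_with(cosine_distance, z_sur, z_orig, c_sur)     # about 1 - 0 = 1
```

The two metrics produce margins on different scales (cosine distance is bounded in $[0, 2]$), so the learning rate may need retuning when the metric changes.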
#### Other hyperparameters\.
The neighborhood size $k$ and the sampling position $t$ on the SMC path also influence the performance of [Algorithm 1](https://arxiv.org/html/2605.22871#algorithm1). We provide additional evaluations in [Sections G.4](https://arxiv.org/html/2605.22871#A7.SS4) and [G.5](https://arxiv.org/html/2605.22871#A7.SS5).
#### Complexity discussion\.
Compared with retraining on $S_r$, ManiF-SMC is efficient because it (i) computes centroids only over the retained neighborhoods $S_k^i$ rather than the full retained set, and (ii) optimizes the SMC path using only $S_k$. The dominant additional cost beyond the ManiF update is the forward/backward pass for path training of $w$ on $S_k$ and the centroid computation for each $x_i$ within a mini-batch. Both are controllable via $k$ and the number of SMC optimization steps.
Table 6. Dataset statistics.

| Dataset | Feature Dimension | # Classes | # Samples |
| --- | --- | --- | --- |
| MNIST (Deng, [2012](https://arxiv.org/html/2605.22871#bib.bib10)) | 28×28×1 | 10 | 70,000 |
| CIFAR10 (Krizhevsky et al., [2009](https://arxiv.org/html/2605.22871#bib.bib22)) | 32×32×3 | 10 | 60,000 |
| CelebA (Liu et al., [2018](https://arxiv.org/html/2605.22871#bib.bib25)) | 178×218×3 | 2 (Gender) | 202,599 |
| Tiny-ImageNet (Le and Yang, [2015](https://arxiv.org/html/2605.22871#bib.bib23)) | 64×64×3 | 200 | 110,000 |
## Appendix D Datasets
The statistics of all datasets used in our experiments are listed in [Table 6](https://arxiv.org/html/2605.22871#A3.T6). MNIST and CIFAR10 are both used to train 10-class classification models. The experiment on CelebA identifies the gender attribute of face images, a binary classification problem that differs from the tasks on MNIST and CIFAR10. The task on Tiny-ImageNet is a 200-class classification. Together, these datasets cover a range of target categories with varying levels of learning complexity. We briefly introduce each below.
- MNIST (Deng, [2012](https://arxiv.org/html/2605.22871#bib.bib10)). MNIST contains 60,000 handwritten digit images for training and 10,000 for testing. All digits are black and white, size-normalized, and centered in a fixed-size image of 28 × 28 pixels.
- CIFAR10 (Krizhevsky et al., [2009](https://arxiv.org/html/2605.22871#bib.bib22)). CIFAR10 consists of 60,000 colour images of 32 × 32 pixels in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.
- CelebA (Liu et al., [2018](https://arxiv.org/html/2605.22871#bib.bib25)). CelebA is a large-scale face attributes dataset with more than 200,000 celebrity images, each with 40 attribute annotations.
- Tiny-ImageNet (Le and Yang, [2015](https://arxiv.org/html/2605.22871#bib.bib23)). Tiny-ImageNet contains images of 64 × 64 pixels in 200 classes, with 100,000 training images and 10,000 test images.
Figure 7. Additional evaluations of the impact of different *USS* on various unlearning methods on CelebA (panels a–c).
## Appendix E Evaluation Metrics
We provide the details of the metrics as follows.
- Membership inference attack (MIA). We employ the membership inference attack (MIA) (Song et al., [2019](https://arxiv.org/html/2605.22871#bib.bib34); Yeom et al., [2018](https://arxiv.org/html/2605.22871#bib.bib46)) to gauge the effectiveness of unlearning. Concretely, we apply the MIA predictor to the unlearned model $f_{\theta_u}$ on the dataset $S_u$. The MIA success rate is the fraction of instances in $S_u$ that are correctly identified as non-training samples for $f_{\theta_u}$. A higher MIA success rate suggests that $f_{\theta_u}$ retains less information about $S_u$.
- Remaining accuracy (RA). This is the accuracy of $f_{\theta_u}$ on the remaining dataset $S_r$, which reflects the fidelity of machine unlearning. Information about the remaining training data should be preserved from the original model to the unlearned model.
- Testing accuracy (TA). We report TA as an indicator of how well the unlearned model $f_{\theta_u}$ generalizes when evaluated on the test dataset.
- Remaining mean-squared error (R-MSE). This is the reconstruction mean-squared error of generative models on the remaining dataset $S_r$, which reflects the fidelity of machine unlearning for generative models.
- Testing mean-squared error (T-MSE). We report T-MSE as an indicator of how well the unlearned generative model performs when evaluated on the test dataset.
- Running Time (RT). This metric reflects the computational efficiency of a machine unlearning method. We obtain RT by tracking the per-batch training duration and multiplying by the total number of training epochs.
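As a small illustration of how the first three metrics reduce to simple fractions, the sketch below assumes the attack and the classifier outputs are already available as arrays. The function names and the toy predictions are hypothetical; the attack model itself is not specified here.

```python
import numpy as np

def accuracy(pred_labels, true_labels):
    """Fraction of correct predictions; used for RA (on S_r) and TA (on the test set)."""
    return float(np.mean(np.asarray(pred_labels) == np.asarray(true_labels)))

def mia_success_rate(predicted_member):
    """Fraction of erased samples that the attack labels as non-members."""
    predicted_member = np.asarray(predicted_member, dtype=bool)
    return float(np.mean(~predicted_member))

# Hypothetical attack outputs on five erased samples (True = "was in training set").
attack_says_member = [True, False, False, True, False]
mia = mia_success_rate(attack_says_member)      # 3 of 5 flagged as non-members

# Hypothetical classifier outputs on four retained samples.
ra = accuracy([1, 2, 3, 4], [1, 2, 0, 4])       # 3 of 4 correct
```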
## Appendix F Machine Unlearning Benchmarks
Machine unlearning seeks to remove or negate the influence of a subset of training data $\{x_i\}_{i \in S_u}$ on a trained model $f_\theta$ (Bourtoule et al., [2021](https://arxiv.org/html/2605.22871#bib.bib3); Warnecke et al., [2024](https://arxiv.org/html/2605.22871#bib.bib42)). Although exact unlearning can be achieved by retraining a model on the remaining dataset, the associated computational cost has motivated more efficient solutions, namely approximate unlearning methods (Shen et al., [2024](https://arxiv.org/html/2605.22871#bib.bib33); Izzo et al., [2021](https://arxiv.org/html/2605.22871#bib.bib20)). Classic approximate unlearning approaches include:
- Gradient Ascent (GA) (Graves et al., [2021](https://arxiv.org/html/2605.22871#bib.bib17); Thudi et al., [2022a](https://arxiv.org/html/2605.22871#bib.bib35)): GA reverses the model training on the erased samples $S_u$ by adding the corresponding gradients back to $\theta_o$, i.e., moving $\theta_o$ in the direction of increasing loss for the data to be unlearned, where $\theta_o$ denotes the original trained model parameters.
- Variational Bayesian Unlearning (VBU) (Nguyen et al., [2020](https://arxiv.org/html/2605.22871#bib.bib28), [2022](https://arxiv.org/html/2605.22871#bib.bib29)): VBU is an approximate unlearning method based on variational Bayesian inference. In practice, a middle layer of the original neural network is used as the Bayesian layer to calculate the VBU loss according to (Nguyen et al., [2020](https://arxiv.org/html/2605.22871#bib.bib28)) to achieve unlearning.
- Representation Forgetting Unlearning (RFU) (Wang et al., [2024](https://arxiv.org/html/2605.22871#bib.bib41)): RFU tries to unlearn a bottleneck representation by minimizing the mutual information between the representation and the erased samples, which provides an unlearning solution from an information theory perspective.
- Saliency Unlearning (SalUn) (Fan et al., [2024](https://arxiv.org/html/2605.22871#bib.bib12)): SalUn introduces "weight saliency" to remove the influence of samples and classes when unlearning a model, improving effectiveness and efficiency.

From the manifold representation viewpoint, unlearning a sample $(x_i, y_i) \in S_u$ can be achieved by pushing its representation $\mathtt{z}_i$ away from the center of its learned local manifold representation, for example, the class-manifold centroid $\mathtt{c}_{y_i}$. One simple operational proxy is to maximize the distance $\lVert \mathtt{z}_i - \mathtt{c}_{y_i} \rVert$, ensuring that the unlearning sample no longer "lives" near its old class manifold centroid.
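The centroid-distance proxy can be sketched directly in representation space. The example below assumes every class has at least one sample and treats the representation itself as the quantity being moved, which is a simplification: in practice the push would be applied through the model parameters. All function names and the toy points are hypothetical.

```python
import numpy as np

def class_centroids(z, y, num_classes):
    """Mean representation per class; assumes each class has at least one sample."""
    return np.stack([z[y == c].mean(axis=0) for c in range(num_classes)])

def centroid_distance(z_i, y_i, centroids):
    """Distance of each representation to the centroid of its own class."""
    return np.linalg.norm(z_i - centroids[y_i], axis=-1)

def push_away(z_i, y_i, centroids, step):
    """Move each representation `step` units further from its class centroid."""
    direction = z_i - centroids[y_i]
    direction = direction / np.linalg.norm(direction, axis=-1, keepdims=True)
    return z_i + step * direction

z = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]])
y = np.array([0, 0, 1, 1])
centroids = class_centroids(z, y, num_classes=2)       # [[1, 0], [10, 11]]

d_before = centroid_distance(z[[0]], y[[0]], centroids)                       # 1.0
d_after = centroid_distance(push_away(z[[0]], y[[0]], centroids, 0.5), y[[0]], centroids)  # 1.5
```

Note that this proxy relies on the label $y_i$ to pick the centroid, whereas ManiF-SMC replaces it with label-free retained-neighbor centroids.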
## Appendix G Additional Evaluations
### G.1. Additional Efficiency Evaluations about USS
We report the additional unlearning effectiveness evaluation on CelebA in [Figure 7](https://arxiv.org/html/2605.22871#A4.F7), which shows the data removal effect and model utility preservation of ManiF-SMC.
We report the additional efficiency evaluation results on CelebA in [Figure 8](https://arxiv.org/html/2605.22871#A7.F8). The running-time curves show a clear efficiency hierarchy: the exact Retrain baseline is always the slowest, often by two to three orders of magnitude, while the approximate unlearning methods (VBU, ManiF-SMC, and SalUn) finish far sooner. Among the approximate methods, VBU is consistently the fastest, SalUn incurs the highest overhead, and ManiF-SMC sits in between, mirroring the computational complexity built into their respective update rules.
Figure 8. Efficiency in terms of running time on CelebA (panel a). Although the RT of approximate unlearning methods (VBU, ManiF-SMC, and SalUn) increases as the USS increases, they remain much more efficient than retraining.

Figure 9. Additional evaluations for ManiF-SMC with fine-tuning on the remaining dataset, on MNIST (panels a–c) and CIFAR10 (panels d–f). Fine-tuning on the remaining dataset effectively mitigates the degradation of model utility (RA and TA) during unlearning, while the unlearning effectiveness (MIA) drops slightly compared with ManiF-SMC without fine-tuning.
### G.2. Limitations and Additional Evaluation with Fine Tuning
Limitations of ManiF-SMC. Since ManiF-SMC unlearns using only the model's manifold representation, without task information, it suffers greater utility loss than SalUn and other approximate unlearning methods that use the remaining data to fine-tune the unlearned model, especially when the unlearning sample size is large. Mitigating this degradation is therefore an important avenue for future research. In this work, we introduce a simple fine-tuning step on the remaining dataset and assess its effect on model performance.
Setup. We additionally test how much fine-tuning improves ManiF-SMC on MNIST and CIFAR10. The USS ranges from 200 to 1200, where 1200 is around $2\%$ of the training data on MNIST and CIFAR10. All other settings are the same as in the previous tests of ManiF-SMC. The corresponding results are presented in [Figure 9](https://arxiv.org/html/2605.22871#A7.F9).
Evaluation Results. In [Figure 9](https://arxiv.org/html/2605.22871#A7.F9), we observe that ManiF-SMC with fine-tuning on the remaining dataset effectively mitigates the degradation of model utility (RA and TA) during unlearning, while the unlearning effectiveness (MIA) drops slightly compared with ManiF-SMC without fine-tuning. Note that the original ManiF-SMC unlearns based only on the learned manifold representation and needs no task information, whereas fine-tuning on the remaining dataset requires label information to compute the original learning loss. Fine-tuning therefore improves utility preservation but makes ManiF-SMC resemble existing unlearning methods, since it no longer removes the reliance on task information.
### G.3. Ablation Study: How do Distance Metrics Influence the Performance of ManiF-SMC?
Setup. As mentioned above, the choice of distance metric may influence the unlearning method. We therefore ran experiments with an alternative metric, cosine similarity (via the Normalized Temperature-Scaled Cross Entropy (NT-Xent) loss used in SimCLR (Chen et al., [2020](https://arxiv.org/html/2605.22871#bib.bib7))). To better adapt cosine similarity (NT-Xent) to our setting, we also evaluated a variant with multi-positive samples (adding one additional positive), because negative samples are limited in the unlearning setting.
Results. In [Table 7](https://arxiv.org/html/2605.22871#A7.T7), cosine similarity achieves model utility preservation similar to the $L_2$ norm but a lower MIA value, i.e., a weaker unlearning effect. We infer that when similarity is used as the triplet distance, the optimization favors the (anchor, positive) pair, making unlearning more difficult than with the $L_2$ norm. Cosine similarity (NT-Xent) with multi-positive samples increases the utility of the unlearned model but further reduces the unlearning effect (MIA). We also ran an experiment that uses label information for fine-tuning: $L_2$ norm (positive with label). It likewise increases the utility of the unlearned model and shows the potential of fine-tuning for ManiF-SMC. Nevertheless, the main advantage of our method remains that it does not rely on the task or on the corresponding gradient ascent.
Table 7. Evaluation of Different Distance Metrics for ManiF-SMC on MNIST.

| Distance Metric | MIA (%) | RA (%) | TA (%) | RT (second) |
| --- | --- | --- | --- | --- |
| $L_2$ norm | 62.00 | 99.59 | 99.15 | 1.482 |
| $L_2$ norm (with label) | 58.00 | 99.62 | 99.20 | 1.598 |
| Cosine Similarity (NT-Xent) | 60.00 | 99.38 | 99.04 | 2.082 |
| Cosine Similarity (NT-Xent) with Multi-positive | 54.00 | 99.55 | 99.15 | 2.703 |
Table 8. Additional evaluations of the number $k$ of chosen similar remaining samples on MNIST, CIFAR10, Tiny-ImageNet, and CelebA. All values are percentages.

| $k$ | MNIST MIA | MNIST RA | MNIST TA | CIFAR10 MIA | CIFAR10 RA | CIFAR10 TA | Tiny-ImageNet MIA | Tiny-ImageNet RA | Tiny-ImageNet TA | CelebA MIA | CelebA RA | CelebA TA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 | 62.00 | 99.59 | 99.15 | 59.00 | 99.04 | 81.75 | 54.50 | 81.19 | 56.88 | 54.50 | 96.34 | 95.93 |
| 6 | 61.50 | 99.58 | 99.15 | 59.50 | 99.04 | 81.73 | 54.50 | 80.10 | 56.06 | 55.00 | 96.34 | 95.93 |
| 7 | 61.00 | 99.56 | 98.84 | 59.50 | 99.04 | 81.73 | 54.50 | 80.10 | 56.07 | 55.00 | 96.35 | 95.94 |
| 8 | 62.00 | 99.58 | 99.15 | 59.50 | 99.03 | 81.75 | 54.50 | 81.20 | 56.89 | 54.00 | 96.47 | 96.17 |
| 9 | 61.50 | 99.58 | 99.15 | 59.50 | 99.04 | 81.75 | 54.50 | 81.20 | 56.91 | 54.00 | 96.47 | 96.17 |
| 10 | 61.50 | 99.58 | 99.15 | 58.50 | 99.05 | 81.75 | 54.50 | 81.19 | 56.90 | 53.00 | 96.48 | 96.19 |
### G.4. Additional Study: Influence of the Number $k$ of Chosen Similar Remaining Positive Samples
Setup. In ManiF-SMC, we choose the top-$k$ most similar remaining samples as the positive set $S_k$, and the anchor is also calculated from the center of the top-$k$ samples. Hence, we test how different values of $k$ influence ManiF-SMC. We randomly select 200 samples for unlearning, and the experimental setting is the same as above. We present the results on MNIST, CIFAR10, Tiny-ImageNet, and CelebA in [Table 8](https://arxiv.org/html/2605.22871#A7.T8).
Results. Intuitively, choosing more similar samples from the remaining dataset improves model utility preservation during unlearning, because the change in the class centroid caused by erasing one sample can be expressed as $\lVert x_u - \mathtt{c} \rVert / k$. The results on CelebA in [Table 8](https://arxiv.org/html/2605.22871#A7.T8) confirm this: RA and TA are higher when $k$ increases. RA increases from 96.34% ($k=5$) to 96.48% ($k=10$), and TA climbs from 95.93% to 96.19%. However, the unlearning effectiveness (MIA) drops at the same time, from 54.50% ($k=5$) to 53.00% ($k=10$). Hence, a larger $k$ helps maintain model utility but reduces unlearning effectiveness, a trend that the results on the other datasets in [Table 8](https://arxiv.org/html/2605.22871#A7.T8) also support, though less markedly.
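The $1/k$ scaling of the centroid shift can be made explicit with a short derivation. It assumes, as one plausible reading of the statement above, that $\mathtt{c}$ is the mean of $k$ points that include the erased point $x_u$, and that $\mathtt{c}'$ is the mean of the remaining $k-1$ points; under this assumption the exact factor is $1/(k-1)$, which behaves like $1/k$ for moderately large $k$.

```latex
\begin{align*}
\mathtt{c} &= \frac{1}{k}\sum_{j=1}^{k} x_j,
\qquad
\mathtt{c}' = \frac{1}{k-1}\Big(\sum_{j=1}^{k} x_j - x_u\Big)
            = \frac{k\,\mathtt{c} - x_u}{k-1}, \\
\mathtt{c}' - \mathtt{c} &= \frac{k\,\mathtt{c} - x_u - (k-1)\,\mathtt{c}}{k-1}
            = \frac{\mathtt{c} - x_u}{k-1}, \\
\lVert \mathtt{c}' - \mathtt{c} \rVert &= \frac{\lVert x_u - \mathtt{c} \rVert}{k-1}
            \approx \frac{\lVert x_u - \mathtt{c} \rVert}{k}.
\end{align*}
```

Either way, the perturbation of the centroid shrinks inversely with the neighborhood size, which is consistent with the utility trend reported for larger $k$.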
Table 9. Evaluation of checkpoints sampled on the connectivity curve on MNIST.

| $t$ | MIA (%) | RA (%) | TA (%) |
| --- | --- | --- | --- |
| 0.1 | 58.50 | 99.62 | 99.22 |
| 0.3 | 61.00 | 99.63 | 99.21 |
| 0.5 | 62.00 | 99.59 | 99.15 |
| 0.7 | 58.50 | 99.59 | 99.20 |
| 0.9 | 58.00 | 99.62 | 99.21 |
### G.5. Ablation Study: How does $t$ of SMC Influence ManiF-SMC?
In [Section 4.2](https://arxiv.org/html/2605.22871#S4.SS2), [Eqs. 7](https://arxiv.org/html/2605.22871#S4.E7) and [8](https://arxiv.org/html/2605.22871#S4.E8) are applied to train the mode connectivity model that generates the local manifold representation for the adaptive margin calculation. We can train a connectivity curve over $t \in [0,1]$ with a step of 0.1, i.e., a curve with checkpoints in $\{0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1\}$. In our experiments, we use the checkpoint at $t = 0.5$, which gives the best unlearning effect. [Table 9](https://arxiv.org/html/2605.22871#A7.T9) reports the results for several checkpoints along the curve.
Table 10. Evaluation of different learning models on MNIST (before and after unlearning).

| Methods | MIA (%) before unl. | MIA (%) after unl. | RA (%) before | RA (%) after | TA (%) before | TA (%) after |
| --- | --- | --- | --- | --- | --- | --- |
| Information Bottleneck (IB) | 51.99 | 61.00 | 99.49 | 99.47 | 98.96 | 98.91 |
| InfoNCE | 50.50 | 53.50 | 99.39 | 99.40 | 98.91 | 98.83 |

Table 11. Evaluation of unlearning for ViT on MNIST (before and after unlearning).

| Model | MIA (%) before unl. | MIA (%) after unl. | MSE on remaining set, before | MSE on remaining set, after | MSE on test set, before | MSE on test set, after |
| --- | --- | --- | --- | --- | --- | --- |
| ViT | 62.00 | 67.00 | 0.0439 | 0.0443 | 0.0442 | 0.0445 |
### G.6. Application: Scalability to Unlearn Other Models
The ManiF-SMC unlearning method is also transferable to other representation learning frameworks. We provide experimental results for InfoNCE (contrastive learning) and information bottleneck (IB) (representation learning) in [Table 10](https://arxiv.org/html/2605.22871#A7.T10).
We also conducted new experiments for a Vision Transformer (ViT) on MNIST for the image generative task. Results are provided in [Table 11](https://arxiv.org/html/2605.22871#A7.T11), indicating the unlearning effectiveness of our method.