Selective Capability Unlearning in End-to-End Spoken Language Understanding

arXiv cs.CL Papers

Summary

Proposes Binding Subspace Unlearning (BSU), a representation-level framework that isolates and attenuates intent-conditioned directions in end-to-end spoken language understanding models. It targets capability persistence, the failure mode in which the slots of a suppressed intent can still be generated when that intent is supplied as a forced prefix. The method reduces forced-prefix recoverability while preserving retained performance on SLU benchmarks.

arXiv:2606.24063v1 (announce type: new)

# Selective Capability Unlearning in End-to-End Spoken Language Understanding
Source: [https://arxiv.org/html/2606.24063](https://arxiv.org/html/2606.24063)
Singh Kurmi

###### Abstract

Modern spoken language understanding (SLU) systems are increasingly deployed in real-world settings, where specific functionalities may need to be removed due to policy or safety constraints. In SLU, a functionality corresponds to an intent and its associated slot-generation behavior. However, in autoregressive models, suppressing a target intent does not eliminate the conditional mapping that generates slots conditioned on that intent. When the intent prefix is externally supplied, the model can reconstruct the original intent-slot structure. We identify this structural failure as *capability persistence*. We propose Binding Subspace Unlearning (BSU), a representation-level framework that isolates and attenuates intent-conditioned directions underlying this mapping. Across SLU benchmarks, BSU substantially reduces forced-prefix recoverability while preserving retained performance.

###### keywords:

Machine Unlearning, Spoken Language Understanding, Speech Recognition.

## 1 Introduction

Spoken language understanding (SLU) is a core component of conversational systems. It enables voice assistants and other spoken interfaces to extract structured semantic information directly from speech [bastianelli2020slurp, wang2021fine, arora2022espnet, seo2022integration]. Modern end-to-end SLU models [huang2023leveraging, sharma2021leveraging] map acoustic input directly to semantic outputs and are widely adopted in diverse applications, including virtual assistants, customer support systems, in-vehicle voice control, and domain-specific voice agents [ma2024speech, mehta2020recent].

As these systems are increasingly deployed in real-world and regulated settings, there is a growing need to adjust the behavior of trained models, including SLU frameworks, after training [koudounas2025alexa]. In practice, certain functionalities may need to be disabled due to policy changes, safety considerations, or regulatory requirements [voigt2017eu, goldman2020introduction, xu2024machine]. For example, a voice assistant may need to deactivate financial transaction capabilities in specific regions or restrict health-related guidance under updated compliance rules. Retraining these models from scratch in such cases is often costly and impractical, motivating targeted post-deployment methods that selectively remove undesired behaviors while preserving other functionalities.

However, in autoregressive SLU models, functionalities are not merely intent labels. They correspond to an intent together with its associated slot-generation behavior. Semantic generation is inherently conditional: the decoder first predicts an intent token, and subsequent slot tokens are generated conditioned on that prefix. Thus, the behavior associated with an intent is governed by the conditional mapping from acoustic input and intent prefix to slot values. In this work, we define the removable functionality as precisely this intent-conditioned mapping. Upon a request to remove a functionality, existing methods typically suppress the marginal probability of the target intent. However, this does not necessarily modify the conditional mapping responsible for slot generation. Consequently, under a forced intent prefix, the model can still reconstruct the corresponding intent-slot structure. We refer to this phenomenon as capability persistence. As illustrated in Figure [1](https://arxiv.org/html/2606.24063#S1.F1), although intent prediction is suppressed under standard decoding, providing the intent as a prefix enables recovery of the associated slot structure. This limitation arises from the conditional generative structure of autoregressive SLU models, which classification-based unlearning methods do not explicitly address. To eliminate the conditional mapping itself, we propose a two-stage framework.

![Refer to caption](https://arxiv.org/html/2606.24063v1/Figures/1122.jpg)
Figure 1: Capability persistence under forced-prefix decoding. Existing methods suppress marginal intent prediction but preserve conditional intent–slot binding, enabling reconstruction under a forced prefix. BSU removes this dependency.

① Binding Subspace Identification. For a target intent $I_F$, we identify representation directions that capture its slot-generation behavior. We extract decoder hidden states at slot positions for forget and retain data and compute layer-wise covariances. By contrasting these covariances, we extract the eigen-directions with the largest positive variance excess under $I_F$, forming a compact subspace that captures the intent-slot dependency.

② Subspace-Guided Capability Attenuation. We fine-tune the model while reducing sensitivity along the identified subspace. Specifically, we apply gradient-based regularization to attenuate dependence on these directions, thereby weakening the conditional mapping while preserving general performance.

Our key contributions are outlined as follows: (i) We formalize selective capability unlearning in SLU by defining intent capability as the conditional mapping and identifying *capability persistence* as recoverable slot generation under forced-prefix decoding. (ii) We propose Binding Subspace Unlearning (BSU; annotations and code will be made publicly available), a representation-level framework that localizes intent-conditioned binding directions via covariance contrast and attenuates them through subspace-guided gradient regularization. (iii) We introduce a recoverability-based evaluation protocol to measure residual conditional behavior beyond intent accuracy. (iv) BSU reduces conditional slot recoverability, with an average drop of $\sim$60% in BRR@10 and $\sim$56% in semantic similarity, while preserving retained-intent performance without inference-time overhead.

## 2 Problem Formulation

We consider an end-to-end spoken language understanding (SLU) model that maps an input speech signal $x \in \mathcal{X}$ to a structured semantic frame represented as a token sequence $y = (y_1, \dots, y_T)$. The sequence follows a fixed format in which the initial tokens encode an intent label $i \in \mathcal{I}$, followed by slot-type and slot-value tokens $s \in \mathcal{S}$. The model is autoregressive and parameterized by $\theta$, defining

$$p_\theta(y \mid x) = \prod_{t=1}^{T} p_\theta(y_t \mid y_{<t}, x).$$

Because intent tokens precede slot tokens in the decoding order, the joint distribution factorizes as

$$p_\theta(i, s \mid x) = p_\theta(i \mid x)\, p_\theta(s \mid i, x),$$

where slot generation is conditioned jointly on the acoustic input and the decoded intent prefix. We define the semantic capability associated with intent $i$ as the conditional mapping

$$C_\theta(i)(x) = p_\theta(s \mid i, x),$$

which characterizes the model's ability to generate slot values given the intent and acoustic input. This definition distinguishes capability from marginal intent prediction. Suppressing the marginal probability $p_\theta(i \mid x)$ does not necessarily eliminate the conditional mapping $p_\theta(s \mid i, x)$ when the intent prefix is externally supplied during decoding. This motivates the problem of selective capability unlearning.
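The distinction between the marginal and the conditional can be checked numerically. The following is a minimal sketch with made-up numbers for a single utterance; the intent names, slot strings, and the `suppress_marginal` helper are hypothetical illustrations and do not come from the paper. It shows that pushing $p_\theta(i \mid x)$ toward zero collapses the joint probability under standard decoding but leaves $p_\theta(s \mid i, x)$, the quantity probed by forced-prefix decoding, untouched.

```python
# Toy distributions for one utterance x. All numbers are invented for illustration.
p_intent = {"transfer_money": 0.90, "play_music": 0.10}            # p(i | x)
p_slots_given_intent = {                                             # p(s | i, x)
    "transfer_money": {"amount=50;payee=alice": 0.95, "other": 0.05},
    "play_music": {"artist=queen": 0.80, "other": 0.20},
}

def suppress_marginal(p_intent, target, eps=1e-3):
    """Set p(target | x) to eps and renormalise the remaining intents."""
    rest = {i: p for i, p in p_intent.items() if i != target}
    scale = (1.0 - eps) / sum(rest.values())
    out = {i: p * scale for i, p in rest.items()}
    out[target] = eps
    return out

def joint(p_intent, p_slots, intent, slots):
    """p(i, s | x) = p(i | x) * p(s | i, x)."""
    return p_intent[intent] * p_slots[intent][slots]

target, frame = "transfer_money", "amount=50;payee=alice"
suppressed = suppress_marginal(p_intent, target)

joint_before = joint(p_intent, p_slots_given_intent, target, frame)
joint_after = joint(suppressed, p_slots_given_intent, target, frame)
# Forced-prefix decoding bypasses p(i | x) and reads the conditional directly.
forced_prefix = p_slots_given_intent[target][frame]

print(f"joint before: {joint_before:.4f}, joint after: {joint_after:.5f}")
print(f"forced-prefix conditional (unchanged): {forced_prefix:.2f}")
```

In this toy setting only the intent marginal was edited, so the conditional table is identical before and after suppression, which is the situation the formulation above describes.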

### 2.1 Selective Capability Unlearning

Let $I_F \in \mathcal{I}$ denote a target intent whose associated capability must be removed (written $i_f$ when it appears as the conditioning prefix in a formula). The training corpus is partitioned into a forget set $\mathcal{D}_F$, containing samples labeled with $I_F$, and a retain set $\mathcal{D}_R$, containing all remaining intents. Given pretrained parameters $\theta_0$, the objective is to obtain parameters $\theta^*$ such that:

1. ① Capability Erasure. The conditional mapping $p_{\theta^*}(s \mid i_f, x)$ must be substantially degraded for inputs $x \sim \mathcal{D}_F$, such that even when the intent prefix $I_F$ is externally supplied, the model can no longer generate correct slots.
2. ② Capability Retention. For all $i \neq i_f$, the joint behavior $p_{\theta^*}(i, s \mid x)$ should remain close to that under $\theta_0$, thereby preserving performance on all retained intents.

In other words, our objective is to degrade the conditional mapping associated with the target intent while preserving the remaining capabilities of the model.

### 2.2 Failure Mode: Capability Persistence

Existing unlearning methods typically suppress the marginal likelihood $p_\theta(i_f \mid x)$ of the target intent. However, this does not necessarily modify the conditional slot-generation distribution $p_\theta(s \mid i_f, x)$. This limitation arises from autoregressive decoding. For slot positions $t > t_{\text{intent}}$, the decoder hidden state $h_t = f_\theta(x, i_f, s_{<t})$ depends jointly on the acoustic input $x$, the intent prefix $I_F$, and previously generated tokens. The next-token probability is computed as

$$p_\theta(y_t \mid y_{<t}, x) = \mathrm{softmax}(W h_t + b).$$

Here, $f_\theta$ denotes the decoder network parameterized by $\theta$, while $W$ and $b$ are the output projection parameters mapping hidden states to vocabulary logits. Because representation directions in $h_t$ encode the intent-slot associations learned during training, suppressing $p_\theta(i_f \mid x)$ does not necessarily modify these directions. As shown in Table [1](https://arxiv.org/html/2606.24063#S3.T1), intent-level suppression reduces intent accuracy but leaves conditional slot recoverability largely intact. Consequently, the conditional mapping $p_\theta(s \mid i_f, x)$ remains active, and correct slots can still be generated under forced-prefix decoding. We refer to this structural phenomenon as *capability persistence*. Effective selective unlearning should therefore intervene in the representation space to weaken intent-conditioned slot generation rather than only suppress marginal intent prediction.

## 3 BSU: Binding Subspace Unlearning

Building on this formulation, we introduce Binding Subspace Unlearning (BSU), a two-stage unlearning framework that intervenes in the model's representation space to mitigate capability persistence. Rather than suppressing only the marginal intent probability $p_\theta(i_f \mid x)$, BSU targets hidden-state directions associated with intent-conditioned slot generation. Let $\theta_0$ be a pretrained SLU model and $\theta$ a trainable copy initialized from $\theta_0$. Given a target intent $I_F$, the objective is to degrade the conditional distribution $p_\theta(s \mid i_f, x)$ under forced-prefix decoding while preserving performance on retained intents. BSU achieves this through two stages: (i) identifying representation directions enriched when the target intent $I_F$ is present, and (ii) reducing model sensitivity along these directions to weaken the conditional mapping.

### 3.1 Stage I: Binding Subspace Identification

The goal of Stage I is to identify representation directions that are statistically enriched when the target intent $I_F$ is present. These directions approximate components of the hidden state that influence the conditional slot-generation distribution $p_\theta(s \mid i_f, x)$. Since next-token probabilities are computed from hidden states, analyzing hidden representations at slot positions allows us to probe directions associated with intent-conditioned slot generation.

(1) Hidden-State Extraction. Semantic outputs follow a structured prefix format consisting of an intent token followed by slot tokens. We estimate the boundary between intent and slot tokens and extract hidden states at slot positions using teacher-forced decoding [williams1989learning]:

$$h_t^{(\ell)} = f_\theta^{(\ell)}(x, y_{<t}), \qquad t > t_{\text{prefix}}.$$

Here, $h_t^{(\ell)}$ denotes the decoder representation at layer $\ell$. Teacher forcing ensures alignment between hidden states and the ground-truth slot structure, avoiding variability induced by decoding errors.

(2) Covariance Contrast. Using these slot-position representations $h_t^{(\ell)}$, we compute empirical covariance matrices over hidden states for both the forget dataset $\mathcal{D}_F$ and the retain dataset $\mathcal{D}_R$ at each layer $\ell$. We then form the contrast matrix as:

$$M^{(\ell)} = \mathrm{Cov}_{\mathcal{D}_F}^{(\ell)} - \mathrm{Cov}_{\mathcal{D}_R}^{(\ell)}$$

This contrast highlights representation directions that have higher variance when the target intent $I_F$ is present relative to retained data. Since next-token logits are affine functions of the hidden states, directions that consistently vary under $\mathcal{D}_F$ are expected to contribute more strongly to the conditional slot likelihood $p_\theta(s \mid i_f, x)$.

(3) Subspace Extraction. Finally, we compute the $k$ eigenvectors of $M^{(\ell)}$ with the largest positive eigenvalues and collect them as the columns of $U^{(\ell)} \in \mathbb{R}^{d \times k}$. These eigenvectors define a low-dimensional subspace that approximates representation directions associated with intent-conditioned slot generation. This construction is statistical rather than exact; it identifies directions enriched under $I_F$ without assuming strict disentanglement of representations.
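The covariance contrast and eigenvector extraction described in steps (2) and (3) can be sketched for a single layer with NumPy. This is an illustrative sketch rather than the authors' implementation: the function name `binding_subspace`, the synthetic Gaussian hidden states, and the choice $k = 1$ are assumptions, and real inputs would be teacher-forced decoder states at slot positions.

```python
import numpy as np

def binding_subspace(h_forget, h_retain, k):
    """Eigenvectors of Cov_F - Cov_R with the largest positive eigenvalues.

    h_forget, h_retain: arrays of shape (num_positions, d) holding
    slot-position hidden states from one decoder layer.
    Returns (U, eigenvalues) with U of shape (d, k') where k' <= k.
    """
    cov_f = np.cov(h_forget, rowvar=False)
    cov_r = np.cov(h_retain, rowvar=False)
    contrast = cov_f - cov_r                       # M = Cov_F - Cov_R, symmetric
    eigvals, eigvecs = np.linalg.eigh(contrast)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
    keep = np.array([j for j in order[:k] if eigvals[j] > 0], dtype=int)
    return eigvecs[:, keep], eigvals[keep]

# Synthetic check: forget-set states carry extra variance along the first axis.
rng = np.random.default_rng(0)
d = 6
h_retain = rng.normal(size=(4000, d))
h_forget = rng.normal(size=(4000, d))
h_forget[:, 0] *= 3.0                              # variance excess along e_0

U, excess = binding_subspace(h_forget, h_retain, k=1)
alignment = abs(U[0, 0])                           # |<u_1, e_0>|
print(U.shape, float(excess[0]), float(alignment))
```

Because `np.linalg.eigh` returns orthonormal eigenvectors, the columns of `U` can be used directly in an orthogonal projector $U U^{\top}$. Filtering on positive eigenvalues means fewer than $k$ directions are returned when the forget set shows little variance excess.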

### 3.2 Stage II: Subspace-Guided Capability Attenuation

Stage II attenuates intent-conditioned slot generation by reducing model sensitivity along the identified binding subspaces $U^{(\ell)}$. Rather than modifying only output distributions, BSU targets the first-order sensitivity of the conditional likelihood with respect to hidden states. For slot positions associated with $I_F$, we compute the gradient of the teacher-forced conditional log-likelihood:

$$g_t^{(\ell)} = \nabla_{h_t^{(\ell)}} \log p_\theta(s \mid i_f, x)$$

If slot generation under $I_F$ relies on particular representation directions, the gradient will exhibit large components along them. We therefore project the gradients onto the binding subspaces and penalize their squared magnitude:

$$\mathcal{L}_{\mathrm{bind}} = \sum_{\ell=1}^{L} \sum_{t > t_{\mathrm{prefix}}} \left\lVert U^{(\ell)} U^{(\ell)\top} g_t^{(\ell)} \right\rVert_2^2$$

Minimizing $\mathcal{L}_{\mathrm{bind}}$ reduces sensitivity along representation directions enriched under $I_F$, weakening intent-conditioned slot generation even when the intent prefix is externally supplied. The attenuation is achieved through parameter updates and introduces no inference-time overhead.
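The projected-gradient penalty can be illustrated on the simplest possible case: one layer whose hidden state feeds a linear softmax head, where the gradient of the log-probability with respect to $h$ has the closed form $W_{y} - \sum_v p_v W_v$. This is a sketch under that simplifying assumption, with hypothetical function names and random toy parameters. In a full model the gradients at each layer would come from automatic differentiation through the decoder, and training on the penalty would additionally require differentiating it with respect to $\theta$.

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax of a 1-D logit vector."""
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def grad_log_prob_wrt_hidden(W, b, h, target):
    """Gradient of log softmax(W h + b)[target] with respect to h."""
    p = np.exp(log_softmax(W @ h + b))    # token probabilities, shape (V,)
    return W[target] - p @ W              # shape (d,)

def binding_penalty(U, grads):
    """Sum over positions of ||U U^T g_t||_2^2 for one layer.

    U: (d, k) matrix with orthonormal columns; grads: (T, d), one row per position.
    """
    projected = grads @ U @ U.T           # row t equals (U U^T g_t)^T
    return float(np.sum(projected ** 2))

# Toy head: vocabulary of 5 tokens, hidden size 4 (random values for illustration).
rng = np.random.default_rng(0)
V, d = 5, 4
W = rng.normal(size=(V, d))
b = rng.normal(size=V)
h = rng.normal(size=d)

g = grad_log_prob_wrt_hidden(W, b, h, target=2)
U = np.eye(d)[:, :1]                      # hypothetical binding direction: e_0
penalty = binding_penalty(U, g[None, :])  # equals g[0] ** 2 for this choice of U
print(penalty)
```

With $U$ set to a single coordinate axis the penalty reduces to the squared gradient component along that axis, which makes the role of the projector easy to verify by hand.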

Table 1: Selective capability unlearning on SLURP and SpeechMASSIVE. We report intent accuracy ($I_F$), slot macro-F1 ($F1_F$), BRR@10, and semantic similarity (Sim.) on the target $\mathcal{D}_F$ and retain $\mathcal{D}_R$ sets, respectively. Bold: best; underlined: second-best.

### 3.3 Overall Unlearning Objective

For the final update step, we build on the standard forget–retain fine-tuning objective [ren2025sok, kurmanji2023towards]. Let $L_F = -\mathbb{E}_{(x,y) \sim \mathcal{D}_F} \log p_\theta(y \mid x)$ and $L_R = -\mathbb{E}_{(x,y) \sim \mathcal{D}_R} \log p_\theta(y \mid x)$ denote the negative log-likelihood losses on the forget and retain sets, respectively. Since minimizing $L_F$ would further fit the forget examples, we reverse its sign in the unlearning objective. Thus, gradient descent on the total objective performs gradient ascent on $L_F$, suppressing the target capability, while $L_R$ preserves performance on non-target intents:

$$L_{\mathrm{base}} = -L_F + \lambda_{\mathrm{ret}} L_R.$$

To stabilize optimization and limit deviation from the original model $\theta_0$, we introduce a retain-set KL regularizer $L_{\mathrm{kl}}$, which keeps the updated model close to the original distribution on $\mathcal{D}_R$. We then incorporate the binding loss $L_{\mathrm{bind}}$ (denoted $\mathcal{L}_{\mathrm{bind}}$ in Section 3.2) to reduce sensitivity along intent-conditioned slot-generation directions. The final objective is

$$L = -L_F + \lambda_{\mathrm{ret}} L_R + \lambda_{\mathrm{kl}} L_{\mathrm{kl}} + \lambda_{\mathrm{bind}} L_{\mathrm{bind}}.$$

All terms are optimized with standard gradient descent; the negative coefficient on $L_F$ induces ascent on the forget-set NLL, while the retain, KL, and binding terms are minimized normally.
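As a small worked instance of the final objective, the sketch below combines four scalar loss values with the signs given above. The function name and the example loss values are hypothetical; the default weights are the ones reported in the implementation details ($\lambda_{\mathrm{ret}} = 1.0$, $\lambda_{\mathrm{kl}} = 0.1$, $\lambda_{\mathrm{bind}} = 0.5$).

```python
def bsu_objective(l_forget, l_retain, l_kl, l_bind,
                  lam_ret=1.0, lam_kl=0.1, lam_bind=0.5):
    """L = -L_F + lam_ret * L_R + lam_kl * L_kl + lam_bind * L_bind."""
    return -l_forget + lam_ret * l_retain + lam_kl * l_kl + lam_bind * l_bind

# Invented per-batch loss values, used only to show the sign conventions.
total = bsu_objective(l_forget=2.0, l_retain=0.5, l_kl=0.2, l_bind=0.4)

# A larger forget-set NLL lowers the objective, so descent on L is ascent on L_F.
total_more_forgetting = bsu_objective(l_forget=3.0, l_retain=0.5, l_kl=0.2, l_bind=0.4)
print(total, total_more_forgetting)
```

In a training loop each argument would be a differentiable tensor rather than a float, and the single scalar returned here is what the optimizer would minimize.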

## 4 Experimental Setup

### 4.1 Datasets and Task Setup

Datasets. We evaluate selective capability unlearning on SLURP [bastianelli2020slurp], a standard end-to-end SLU benchmark dataset in which each utterance is annotated with an intent and corresponding slot-value pairs. To assess cross-lingual robustness, we additionally evaluate on the French subset of SpeechMASSIVE [lee2024speech], which follows the same semantic mapping. For each dataset, we select a target intent $I_F$ and partition the data into a forget set $\mathcal{D}_F$ (utterances labeled with $I_F$) and a retain set $\mathcal{D}_R$ (remaining intents).

Implementation Details. All experiments use an end-to-end SLU architecture consisting of a Conformer acoustic encoder and a Transformer-based semantic decoder. We evaluate two model variants that share the same architecture, tokenizer, decoder, and training protocol, differing only in encoder initialization. In the (i) ASR-initialized SLU setting, the encoder is initialized from a supervised ASR checkpoint (Conformer–Transformer–Large, NeMo ASR-Set 3.0). In the (ii) SSL-initialized SLU setting, the encoder is instead initialized from a self-supervised speech representation while keeping all other components identical. Unless otherwise stated, we use $\lambda_{\mathrm{ret}} = 1.0$, $\lambda_{\mathrm{kl}} = 0.1$, and $\lambda_{\mathrm{bind}} = 0.5$.

### 4.2 Evaluation Metrics

We evaluate selective capability unlearning using both surface-level and behavioral metrics. Surface metrics include intent accuracy and slot F1 following the SLURP protocol [bastianelli2020slurp]. To measure residual capability beyond exact prediction, we adopt top-$k$ decoding and embedding-based similarity, both widely used in generative unlearning [chen2021evaluating, zhang2019bertscore, yao2024large, carlini2021extracting, holtzman2019curious].

BRR@10 (Beam Retrieval Rate [vijayakumar2018diverse]). BRR@10 measures whether the forgotten capability remains recoverable under beam search. For each utterance $x \in \mathcal{D}_F$, we decode conditioned on the ground-truth intent prefix $I_F$. Let $y$ be the ground-truth semantic frame and $\hat{y}^{(k)}$ the $k$-th beam hypothesis with beam size $K = 10$. We define:

$$\mathrm{BRR@10} = \frac{1}{|\mathcal{D}_F|} \sum_{(x,y) \in \mathcal{D}_F} \mathbf{1}\left(\exists\, k \leq K : \hat{y}^{(k)} = y\right)$$

Semantic Similarity [reimers2019sentence]. To capture residual behavior beyond exact matching, we compute cosine similarity between embeddings $E(\hat{y})$ and $E(y)$, where $E(\cdot)$ maps structured semantic frames to vector representations. The cosine similarity is defined as follows:

$$\mathrm{Sim}(\hat{y}, y) = \frac{E(\hat{y})^{\top} E(y)}{\lVert E(\hat{y}) \rVert_2 \, \lVert E(y) \rVert_2}$$

A higher BRR@10 indicates that the capability is more recoverable under beam search, while a higher semantic similarity reflects stronger alignment with the ground-truth frame despite surface differences.
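Both metrics are straightforward to compute once beam hypotheses and frame embeddings are available. The sketch below is a plain-Python illustration under stated assumptions: semantic frames are represented as strings, the example frames are invented, and the two-dimensional vectors stand in for the output of the embedding model $E(\cdot)$, which is not specified here.

```python
import math

def brr_at_k(beams_per_utterance, references, k=10):
    """Fraction of utterances whose reference frame appears in the top-k beams."""
    hits = sum(
        1 for beams, ref in zip(beams_per_utterance, references) if ref in beams[:k]
    )
    return hits / len(references)

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Invented forget-set frames and forced-prefix beam outputs (best hypothesis first).
references = ["transfer(amount=50)", "transfer(amount=20)", "transfer(payee=bob)"]
beams = [
    ["transfer(amount=15)", "transfer(amount=50)"],   # reference found at rank 2
    ["transfer(amount=30)"],                          # reference not recovered
    ["transfer(payee=bob)"],                          # reference found at rank 1
]

brr = brr_at_k(beams, references, k=10)
sim = cosine_similarity([1.0, 0.0], [1.0, 1.0])       # placeholder embeddings
print(brr, sim)
```

Exact string equality is used for the match test, mirroring the indicator $\hat{y}^{(k)} = y$; any normalization of slot order or whitespace before comparison would be an additional design choice.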

### 4.3 Baselines

As no prior work explicitly studies capability unlearning in end-to-end SLU, we adapt canonical machine unlearning methods to our setting. We evaluate the Gradient Ascent (GA) family, which maximizes the loss on the forget set to disrupt target representations [golatkar2020eternal, yao2024large], along with two stabilized variants: GA+GD, which additionally trains on retain data, and GA+KL, which applies KL regularization toward the original model. We further include Negative Preference Optimization (NPO) and NPO+KL, which penalize distribution alignment with forget-set outputs via a preference objective, and Random Label (RLabel), which replaces forget-set labels with random targets to simulate capability removal [graves2021amnesiac].

![Refer to caption](https://arxiv.org/html/2606.24063v1/x1.png)
(a) Effect on Forget Set

![Refer to caption](https://arxiv.org/html/2606.24063v1/x2.png)
(b) Effect on Retain Set

Figure 2: Effect of the binding regularizer weight $\lambda_{\mathrm{bind}}$. Increasing $\lambda_{\mathrm{bind}}$ suppresses the target capability on the forget set ($\mathcal{D}_F$), reducing all metrics, while performance on the retain set ($\mathcal{D}_R$) remains largely stable.

![Refer to caption](https://arxiv.org/html/2606.24063v1/Figures/beam_position_plot.png)
Figure 3: Exact recovery across beam positions. We report the Beam Retrieval Rate (BRR@10), evaluating whether the correct semantic frame appears within the top-$K$ beam hypotheses. Higher values indicate stronger target recoverability.

## 5 Results and Analysis

### 5.1 Comparison with Baselines

We evaluate whether slot content associated with the target intent can still be generated when the intent prefix is provided at test time, and whether performance on non-target intents is preserved. As shown in Table [1](https://arxiv.org/html/2606.24063#S3.T1), gradient- and preference-based baselines (GA, GA+KL, NPO) reduce marginal intent prediction on the forget set ($\mathcal{D}_F$) in several settings. However, their forced-prefix recovery scores often remain high, indicating that intent-conditioned slot content can still be recovered under forced-prefix decoding. In contrast, BSU (ours) leads to pronounced reductions across all forget-set metrics. For SLURP with the NeMo Conformer-Transformer, BRR@10 decreases from 92.64 to 22.10 and semantic similarity decreases from 90.14 to 24.80. Comparable reductions are observed with the SSL-initialized Conformer, where BRR@10 decreases from 84.42 to 16.30 and semantic similarity decreases from 83.27 to 18.70. The same trend is observed on SpeechMASSIVE, where BSU also reduces BRR@10 and semantic similarity across both model initializations. At the same time, retain-set performance remains stable relative to the original and retrain baselines in most settings, indicating that the unlearning is selective.

### 5.2 Residual Capability Analysis

Random Space Analysis. We include a Random Space (RS) ablation in Table [1](https://arxiv.org/html/2606.24063#S3.T1) to test whether unstructured perturbations in the representation space can induce forgetting. RS perturbs representations along random directions that are not aligned with the subspace encoding the target intent–slot dependency. Consequently, the perturbation does not systematically disrupt the underlying capability, resulting in limited suppression.

Binding Regularizer Sensitivity. Figure [2](https://arxiv.org/html/2606.24063#S4.F2) analyzes the effect of $\lambda_{\mathrm{bind}}$. As $\lambda_{\mathrm{bind}}$ increases, performance on the forget split ($I_F$, $F1_F$, BRR@10) decreases steadily, indicating progressively stronger suppression of the target capability. In contrast, metrics on the retain split remain largely stable across the same range of $\lambda_{\mathrm{bind}}$.

Residual Capability Recovery. Figure [3](https://arxiv.org/html/2606.24063#S4.F3) examines whether the target capability can be regenerated during decoding by recording the beam position at which the exact semantic frame is first recovered. The stacked bars show the fraction of forget-set ($\mathcal{D}_F$) samples whose correct frame first appears at beam position $k$ ($1 \leq k \leq 10$). The original model recovers the correct frame for a large portion of samples at early beam positions, indicating strong memorization of the target behavior. Both retraining and BSU significantly lower recovery across all beam positions, indicating that exact recovery is strongly reduced, even along alternative decoding paths.
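The per-position breakdown behind the stacked bars can be computed as in the following sketch. It assumes semantic frames are compared as strings and uses invented beam outputs; the function name `first_recovery_positions` is hypothetical. Summing the returned fractions over all ranks gives the same quantity as BRR@10.

```python
from collections import Counter

def first_recovery_positions(beams_per_utterance, references, k=10):
    """Fraction of utterances whose reference frame first appears at rank 1..k."""
    counts = Counter()
    for beams, ref in zip(beams_per_utterance, references):
        top = beams[:k]
        if ref in top:
            counts[top.index(ref) + 1] += 1    # list.index gives the first match
    n = len(references)
    return {rank: counts[rank] / n for rank in range(1, k + 1)}

# Invented forced-prefix beam outputs for three forget-set utterances.
references = ["transfer(amount=50)", "transfer(amount=20)", "transfer(payee=bob)"]
beams = [
    ["transfer(amount=15)", "transfer(amount=50)"],   # first recovered at rank 2
    ["transfer(amount=30)"],                          # never recovered
    ["transfer(payee=bob)"],                          # first recovered at rank 1
]

by_rank = first_recovery_positions(beams, references, k=10)
total_recovered = sum(by_rank.values())
print(by_rank, total_recovered)
```

Mass concentrated at low ranks corresponds to the behavior described for the original model, while a successful unlearning method should leave little mass at any rank.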

## 6 Conclusion

In this work, we show that suppressing marginal intent prediction alone does not eliminate the conditional mapping governing slot generation, leading to capability persistence under forced-prefix decoding. We propose Binding Subspace Unlearning (BSU), which removes this dependency by targeting representation-level binding directions. Experiments show substantial reductions in recoverable slot behavior while preserving retained-intent performance. Future work includes extending this approach to broader generative tasks and developing principled capability-level unlearning methods for privacy-centric applications.

## 7 Acknowledgments

We acknowledge the institutional and computational support provided by the Department of Data Science and Engineering, Indian Institute of Science Education and Research Bhopal.

## 8 Generative AI Use Disclosure

Generative AI tools were used only for language editing and polishing. All scientific content, experimental design, analyses, results, and conclusions were developed and verified by the authors, who take full responsibility for the content of this paper.

## References
