unlearning

#unlearning

Forgetful Attention: A Trainable Support-Vector Memory with Certified Selection and Exact Unlearning

arXiv cs.LG · 2026-07-15

Introduces Support Vector Attention (SV-Attention), a trainable max-margin memory that provides certified selection of tokens with zero weight and exact unlearning via a reversible incremental solver. It reports improved rare-item recall and supports patient-record deletion.

Modular Pretraining Enables Access Control

arXiv cs.LG · 2026-07-10

This paper introduces GRAM (gradient-routed auxiliary modules), a modular pretraining method that enables access control by selectively adding and ablating modules to limit dual-use capabilities in AI models. It reports cost reductions compared to data filtering.

Multimodal Unlearning Across Vision, Language, Video, and Audio: Survey of Methods, Datasets, and Benchmarks

arXiv cs.LG · 2026-07-10

A comprehensive survey of methods, datasets, and benchmarks for multimodal unlearning across vision, language, video, and audio, providing a taxonomy and highlighting open problems.

Auditing Forgetting in Limited Memory Language Models

arXiv cs.CL · 2026-07-02

This paper proposes a causal auditing framework to evaluate forgetting in Limited Memory Language Models by varying the database state during inference, discovering that parametric leakage is negligible and post-deletion correctness primarily arises from retrieval artifacts rather than residual parametric memory.

Probing Stylistic Appropriation using Large Language Models: An Evaluation Framework for Copyright Infringement under EU Law

arXiv cs.CL · 2026-07-01

This paper introduces PSALM, an LLM-as-a-judge framework that evaluates copyright infringement under EU law by assessing stylistic and narrative appropriation beyond verbatim memorization, finding that fine-tuning induces systematic stylistic similarity and that existing safeguards are insufficient.

PreUnlearn: Auditing Collateral Knowledge Damage Before Large Language Model Unlearning

arXiv cs.CL · 2026-06-18

This paper proposes PreUnlearn, a framework for auditing collateral knowledge damage in LLM unlearning before execution, using data-centric analysis to predict downstream damage across semantic layers.

Fair Cognitive Impairment Detection Through Unlearning

arXiv cs.LG · 2026-06-18

Proposes a multimodal framework for fair Mild Cognitive Impairment detection from speech, using unlearning via gradient reversal to reduce demographic bias and improve performance across subgroups.

Replay What Matters: Off-Policy Replay for Efficient LLM Reinforcement Unlearning

arXiv cs.CL · 2026-06-16

This paper introduces ReRULE, an off-policy replay method for reinforcement unlearning in LLMs, improving forgetting and retention efficiency on benchmarks like RWKU and MUSE.

Natively Unlearnable Large Language Models

arXiv cs.LG · 2026-06-15

The paper proposes NULLs (Natively Unlearnable LLMs), a model class that isolates source-specific contributions in sparsely activated sinks while sharing backbone neurons, enabling clean unlearning of individual data sources without retraining and preserving general language capabilities.

MLUBench: A Benchmark for Lifelong Unlearning Evaluation in MLLMs

arXiv cs.AI · 2026-06-12

MLUBench is a large-scale benchmark for lifelong unlearning in multimodal large language models (MLLMs), featuring 127 entities across 9 classes. The paper finds that existing unlearning methods suffer from cumulative degradation and proposes LUMoE to mitigate it, reporting significant improvements.

Null-Space Constrained Low-Rank Adaptation for Response-Specified Large Language Model Unlearning

arXiv cs.AI · 2026-06-10

This paper introduces Null-Space Constrained Response-Specified Unlearning (NSRU), a low-rank framework that uses orthogonal-projected LoRA updates confined to the null space of retain subspaces to perform controlled LLM unlearning while preserving benign capabilities.

Multilingual Unlearning in LLMs: Transfer, Dynamics, and Reversibility

arXiv cs.CL · 2026-06-03

This paper studies multilingual unlearning in LLMs by extending the TOFU benchmark to five languages. It finds that unlearning transfer varies by script and family, operates primarily in later decoding layers, and that a single steering direction can recover much of the suppressed knowledge across languages.

Geometric Erasure by Contrastive Velocity Matching in Rectified Flows

arXiv cs.LG · 2026-06-02

This paper introduces GEM, a concept erasure framework for Rectified Flow models that combines trajectory-based unlearning with teacher-guided flow matching, achieving 5× faster and safer content suppression while preserving benign generation.

Model Unlearning Objectives Vary for Distinct Language Functions

arXiv cs.CL · 2026-05-27

The paper argues that unlearning in LLMs should be goal-dependent, proposing a cosine-based meta-learned variant of RMU for dangerous knowledge and a multi-layer objective with probe directions for toxicity, achieving strong results across four 7-8B models.
