knowledge-removal

Model Unlearning Objectives Vary for Distinct Language Functions

arXiv cs.CL · 2026-05-27

The paper argues that the unlearning objective for LLMs should depend on the goal. For dangerous knowledge, it proposes a cosine-based, meta-learned variant of RMU; for toxicity, it proposes a multi-layer objective with probe directions. It reports strong results across four models in the 7-8B parameter range.
