knowledge-removal

Model Unlearning Objectives Vary for Distinct Language Functions

arXiv cs.CL · 2026-05-27

The paper argues that the unlearning objective for LLMs should depend on the goal. For removing dangerous knowledge it proposes a cosine-based, meta-learned variant of RMU; for reducing toxicity it proposes a multi-layer objective built on probe directions. It reports strong results across four 7-8B models.
