representational-drift

#representational-drift

A Gravitational Interpretation of Fine-Tuning Reversion

arXiv cs.LG · 2d ago

The paper proposes a gravitational interpretation of fine-tuning reversion: early training creates dominant behavioral manifolds that later alignment displaces only shallowly, leaving a persistent reversion direction. Experiments show that blocking this direction reduces harmfulness with minimal task cost.


#representational-drift

Lost or Hidden? A Concept-Level Forgetting in Supervised Continual Learning

arXiv cs.LG · 2026-05-19

This paper introduces a diagnostic framework that uses Sparse Autoencoders to analyze concept-level forgetting in continual learning. It finds that much forgetting is due to representational inaccessibility rather than erasure.

