This paper proposes a continual learning method for LLMs that uses pretrained sparse autoencoders (SAEs) to regularize in activation space rather than weight space. It reports better memory efficiency and stronger benchmark performance, and it mitigates catastrophic forgetting without storing previous data.
This paper proposes H-Res, a method to adapt large transformer models by shaping the energy landscape of associative memories without modifying weights or adding prompts, preserving memory capacity and outperforming LoRA.
DO-ALL is a plug-and-play framework that uses dataset distillation to generate synthetic anchors that summarize source data, enabling stable long-term continual test-time adaptation without retaining original source data.
ReGrad is a proposed paradigm that treats gradients as retrievable units of knowledge for continual post-training. It avoids cumulative weight drift by storing document-specific gradients in a Gradient Bank and retrieving query-relevant gradients for temporary weight adaptation.
This paper rethinks backdoor unlearning from a continual learning perspective, defining complete backdoor unlearning and proposing Blind Inversion-Backdoor Adversarial Unlearning (BI-BAU) that integrates adversarial training into an EM algorithm to effectively eliminate backdoor effects across various attack types and modalities.
An overview of the current state and future outlook of continual learning in mid-2026, covering memory approaches including external memory, in-state memory, and weight updates, with analysis of various models like TTT, Titans, and Dragon Hatchling.
This paper provides a comprehensive survey of Federated Continual Learning (FCL), an emerging field that combines Federated Learning and Continual Learning to enable lifelong, adaptive, and privacy-preserving learning over distributed and non-stationary data. It proposes a taxonomy, reviews applications, metrics, and open challenges.
Pyrecall is a new open-source tool that detects catastrophic forgetting during LLM fine-tuning by snapshotting skill scores before and after training, flagging regressions, and rolling back LoRA adapters. It runs fully locally with no external APIs.
This editorial discusses the resurgence of continual learning in LLMs, highlighting the need for offline consolidation (or 'sleep') to prevent catastrophic forgetting and enable models to stay current and specialized after deployment.
This paper argues that catastrophic forgetting in neural networks is not erasure but an interface alignment problem. It introduces 'transport keys' to recover latent task-specific features from sequentially trained models, demonstrating significant performance recovery on split CIFAR-100.
RAFT is a two-stage framework for domain-specific fine-tuning of LLMs that addresses catastrophic forgetting by refining supervision data and using on-policy distillation with adaptive loss balancing, achieving significant improvements on domain accuracy while recovering general capabilities.
This paper studies catastrophic forgetting in multilingual expert language models during continual pretraining and proposes five parameter alignment strategies (including hard layer freezing, soft regularization, post-hoc weight reversion, and model merging) to mitigate forgetting across 32 training languages with minimal cost to language acquisition.
This paper proposes a local perturbation theory to explain cross-domain interference in multi-domain RL for LLMs, showing that interference is driven by a second-order damage term in a low-dimensional conflict subspace, and demonstrates that brief domain refresh or training-free rollback can selectively recover lost capabilities.
A discussion of why newer state-of-the-art AI models perform worse on the Vending-Bench benchmark. Suggested factors include cheating in earlier runs, ethical alignment that reduces profit-seeking behavior, and catastrophic forgetting caused by overemphasis on coding.
This paper investigates the mechanistic origins of catastrophic forgetting in LLMs, finding that reinforcement learning preserves internal computational circuits better than supervised fine-tuning, resulting in less forgetting of prior capabilities.
This paper systematically studies HiF8 W8A8 quantization-aware training for OpenPangu-Embedded-1B, identifying and addressing failure modes such as amax saturation and catastrophic forgetting, achieving near-lossless performance with a 64-step max-algorithm DTS strategy and a 500-step BF16 warmup.
This paper proposes DG-Hard, a post-hoc spectral repair method that recovers capabilities damaged by fine-tuning without retraining, using only the pretrained and fine-tuned checkpoints. It applies Donoho-Gavish hard singular-value thresholding to weight updates to remove noise and restore degraded performance.
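As a rough illustration of the singular-value thresholding idea in the DG-Hard summary above, the sketch below denoises a single weight update with the Gavish-Donoho hard threshold. It is a minimal sketch under stated assumptions, not the paper's implementation: it assumes the unknown-noise variant (threshold = ω(β) × median singular value, using the published polynomial approximation of ω), applies it to one 2-D weight matrix at a time, and the function names `omega_approx` and `hard_threshold_update` are hypothetical.

```python
import numpy as np


def omega_approx(beta: float) -> float:
    """Gavish-Donoho polynomial approximation of the hard-threshold
    coefficient for an unknown noise level; beta is the matrix aspect
    ratio (shorter side / longer side), in (0, 1]."""
    return 0.56 * beta**3 - 0.95 * beta**2 + 1.82 * beta + 1.43


def hard_threshold_update(w_pre: np.ndarray, w_ft: np.ndarray):
    """Return a repaired weight matrix and the number of singular
    components kept. Only the two checkpoints are needed.

    Assumption: the fine-tuning update is a low-rank signal plus
    roughly i.i.d. noise, which is the setting the threshold targets.
    """
    delta = w_ft - w_pre                      # weight update from fine-tuning
    m, n = delta.shape
    beta = min(m, n) / max(m, n)

    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    tau = omega_approx(beta) * np.median(s)   # data-driven threshold
    keep = s > tau                            # hard thresholding: keep or drop

    delta_denoised = (u[:, keep] * s[keep]) @ vt[keep, :]
    return w_pre + delta_denoised, int(keep.sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_pre = rng.standard_normal((200, 100))
    # Synthetic "fine-tuned" checkpoint: rank-1 signal plus small noise.
    signal = 4.0 * np.outer(rng.standard_normal(200), rng.standard_normal(100)) / 100
    w_ft = w_pre + signal + 0.01 * rng.standard_normal((200, 100))
    w_fixed, rank = hard_threshold_update(w_pre, w_ft)
    print("components kept:", rank)
```

Components whose singular values fall below the threshold are treated as noise and dropped, so the repaired matrix keeps only the dominant directions of the update. In a real model this would presumably be repeated per weight matrix; how the paper handles layers, biases, and non-2-D tensors is not stated in the summary.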
MeMo introduces a modular memory model that augments any LLM to store, retrieve, and integrate new knowledge without retraining or catastrophic forgetting. It outperforms RAG-based methods on benchmarks like BrowseComp-Plus, NarrativeQA, and MuSiQue.
KappaTune, a fine-tuning method designed to mitigate catastrophic forgetting, has been integrated into Hugging Face's PEFT library.
This paper introduces a diagnostic framework using Sparse Autoencoders to analyze concept-level forgetting in continual learning, finding that much forgetting is due to representational inaccessibility rather than erasure.