Learning, Fast and Slow: Towards LLMs That Adapt Continually
Summary
A fast-slow learning framework for LLMs pairs model parameters ("slow" weights, updated via RL) with optimized context ("fast" weights), achieving up to 3x better sample efficiency than RL alone and less catastrophic forgetting in continual learning scenarios.
Cached at: 05/13/26, 08:14 PM
Source: https://huggingface.co/papers/2605.12484
Abstract
A fast-slow learning framework for large language models combines parameter updates with optimized context to achieve better sample efficiency, reduced catastrophic forgetting, and improved adaptability in continual learning scenarios.
Large language models (LLMs) are trained for downstream tasks by updating their parameters (e.g., via RL). However, updating parameters forces them to absorb task-specific information, which can result in catastrophic forgetting and loss of plasticity. In contrast, in-context learning with fixed LLM parameters can cheaply and rapidly adapt to task-specific requirements (e.g., prompt optimization), but cannot by itself typically match the performance gains available through updating LLM parameters. There is no good reason for restricting learning to being in-context or in-weights. Moreover, humans also likely learn at different time scales (e.g., System 1 vs 2). To this end, we introduce a fast-slow learning framework for LLMs, with model parameters as “slow” weights and optimized context as “fast” weights. These fast “weights” can learn from textual feedback to absorb the task-specific information, while allowing slow weights to stay closer to the base model and persist general reasoning behaviors. Fast-Slow Training (FST) is up to 3x more sample-efficient than only slow learning (RL) across reasoning tasks, while consistently reaching a higher performance asymptote. Moreover, FST-trained models remain closer to the base LLM (up to 70% less KL divergence), resulting in less catastrophic forgetting than RL-training. This reduced drift also preserves plasticity: after training on one task, FST-trained models adapt more effectively to a subsequent task than parameter-only trained models. In continual learning scenarios, where task domains change on the fly, FST continues to acquire each new task while parameter-only RL stalls.
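The abstract reports drift from the base model as a KL divergence. As a rough illustration of what that quantity measures, here is a minimal Python sketch that compares two hypothetical fine-tuned next-token distributions against a base distribution. The function name `kl_divergence`, the direction KL(tuned || base), and all of the probabilities are assumptions chosen for illustration; they are not the paper's estimator or its results.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions.

    Both inputs must cover the same support, in the same order.
    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Hypothetical next-token distributions over a 3-token vocabulary.
# These numbers are invented for illustration, not taken from the paper.
base = [0.70, 0.20, 0.10]       # base model
rl_tuned = [0.20, 0.30, 0.50]   # a model that drifted far from the base
fst_tuned = [0.60, 0.25, 0.15]  # a model that stayed close to the base

drift_rl = kl_divergence(rl_tuned, base)
drift_fst = kl_divergence(fst_tuned, base)

print(f"KL(rl_tuned  || base) = {drift_rl:.4f} nats")
print(f"KL(fst_tuned || base) = {drift_fst:.4f} nats")
```

In this toy setup the distribution that stays nearer the base yields the smaller divergence, which is the sense in which a lower KL indicates less drift. In practice the divergence would be estimated over model outputs on many prompts rather than over a single hand-written distribution.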
Get this paper in your agent:
hf papers read 2605.12484
Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Similar Articles
Learning, Fast and Slow: Towards LLMs That Adapt Continually [R]
This paper introduces a Fast-Slow Training framework for LLMs that combines parameter updates with optimized context to improve sample efficiency and reduce catastrophic forgetting during continual learning.
@MihaelaVDS: Can LLMs keep learning new skills without updating their weights? Modern LLMs can already master & combine many skills.…
Introduces 'skill neologisms', a method for enabling LLMs to learn new skills without weight updates, addressing catastrophic forgetting. Presented at ICML.
@LakshyAAAgrawal: Learning from rich textual feedback (errors, traces, partial reasoning) beats scalar reward alone for LLM optimization.…
Fast-Slow Training (FST) interleaves context optimization (via GEPA) with model weight updates via RL, achieving 3× sample efficiency over RL alone on math, code, and physics reasoning while preserving plasticity and enabling continual learning.
From Weights to Features: SAE-Guided Activation Regularization for LLM Continual Learning
This paper proposes a continual learning method for LLMs that uses pretrained sparse autoencoders (SAEs) to regularize in activation space instead of weight space, achieving better memory efficiency and stronger performance on benchmarks while avoiding catastrophic forgetting without storing previous data.
Personal continual learning for LLMs without GPU — position paper [OC]
The author proposes two architectures, Internal KV-Sphere Architecture (IKSA) and Background Micro Fine-Tuning (BMFT), for enabling LLMs to learn continually from personal interactions without GPU requirements and without catastrophic forgetting.