Learning, Fast and Slow: Towards LLMs That Adapt Continually

Hugging Face Daily Papers 05/12/26, 12:00 AM Papers

Summary

A fast-slow learning framework for LLMs pairs model parameters as "slow" weights with optimized context as "fast" weights, achieving up to 3x better sample efficiency than RL-only training and less catastrophic forgetting in continual learning scenarios.

Large language models (LLMs) are trained for downstream tasks by updating their parameters (e.g., via RL). However, updating parameters forces them to absorb task-specific information, which can result in catastrophic forgetting and loss of plasticity. In contrast, in-context learning with fixed LLM parameters can cheaply and rapidly adapt to task-specific requirements (e.g., prompt optimization), but typically cannot by itself match the performance gains available through updating LLM parameters. There is no good reason for restricting learning to being in-context or in-weights. Moreover, humans also likely learn at different time scales (e.g., System 1 vs 2). To this end, we introduce a fast-slow learning framework for LLMs, with model parameters as "slow" weights and optimized context as "fast" weights. These fast "weights" can learn from textual feedback to absorb the task-specific information, while allowing slow weights to stay closer to the base model and retain general reasoning behaviors. Fast-Slow Training (FST) is up to 3x more sample-efficient than only slow learning (RL) across reasoning tasks, while consistently reaching a higher performance asymptote. Moreover, FST-trained models remain closer to the base LLM (up to 70% less KL divergence), resulting in less catastrophic forgetting than RL training. This reduced drift also preserves plasticity: after training on one task, FST-trained models adapt more effectively to a subsequent task than parameter-only trained models. In continual learning scenarios, where task domains change on the fly, FST continues to acquire each new task while parameter-only RL stalls.
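The two-timescale idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: the class `ToyFastSlowLearner`, the scalar `slow_weight`, the reward-minus-baseline update rule, the `slow_every` schedule, and the example episodes are all hypothetical stand-ins chosen only to show a textual "fast" state and a numeric "slow" state being updated side by side.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class ToyFastSlowLearner:
    """Hypothetical toy: a scalar 'slow' weight plus a textual 'fast' context."""

    slow_weight: float = 0.0  # stands in for LLM parameters
    context: List[str] = field(default_factory=list)  # stands in for an optimized prompt
    slow_lr: float = 0.1  # small step size keeps the slow weight near its start

    def fast_update(self, feedback: str) -> None:
        # Fast path: store new textual feedback in the context, skipping repeats.
        if feedback not in self.context:
            self.context.append(feedback)

    def slow_update(self, reward: float, baseline: float) -> None:
        # Slow path: a small reward-driven nudge (an assumed rule, not the paper's).
        self.slow_weight += self.slow_lr * (reward - baseline)


def train(
    learner: ToyFastSlowLearner,
    episodes: Iterable[Tuple[str, float]],
    slow_every: int = 2,
) -> ToyFastSlowLearner:
    """Fast update on every episode; slow update every `slow_every` episodes."""
    for step, (feedback, reward) in enumerate(episodes, start=1):
        learner.fast_update(feedback)
        if step % slow_every == 0:
            learner.slow_update(reward, baseline=0.5)
    return learner


# Made-up (feedback, reward) pairs, for illustration only.
episodes = [
    ("show intermediate steps", 0.4),
    ("check units in the final answer", 0.7),
    ("show intermediate steps", 0.9),
    ("state assumptions explicitly", 1.0),
]
learner = train(ToyFastSlowLearner(), episodes)
print(learner.context)
print(round(learner.slow_weight, 3))
```

In this toy, the task-specific detail ends up in `context`, while `slow_weight` moves only a little. Whether the actual method updates the two states at different frequencies is not stated in the abstract; the `slow_every` schedule is an assumption of the sketch.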

Original Article


Cached at: 05/13/26, 08:14 PM

Paper page - Learning, Fast and Slow: Towards LLMs That Adapt Continually

Source: https://huggingface.co/papers/2605.12484

Abstract

A fast-slow learning framework for large language models pairs slowly changing model parameters with optimized context to achieve better sample efficiency, less catastrophic forgetting, and improved adaptability in continual learning scenarios.

Large language models (LLMs) are trained for downstream tasks by updating their parameters (e.g., via RL). However, updating parameters forces them to absorb task-specific information, which can result in catastrophic forgetting and loss of plasticity. In contrast, in-context learning with fixed LLM parameters can cheaply and rapidly adapt to task-specific requirements (e.g., prompt optimization), but typically cannot by itself match the performance gains available through updating LLM parameters. There is no good reason for restricting learning to being in-context or in-weights. Moreover, humans also likely learn at different time scales (e.g., System 1 vs 2). To this end, we introduce a fast-slow learning framework for LLMs, with model parameters as “slow” weights and optimized context as “fast” weights. These fast “weights” can learn from textual feedback to absorb the task-specific information, while allowing slow weights to stay closer to the base model and retain general reasoning behaviors. Fast-Slow Training (FST) is up to 3x more sample-efficient than only slow learning (RL) across reasoning tasks, while consistently reaching a higher performance asymptote. Moreover, FST-trained models remain closer to the base LLM (up to 70% less KL divergence), resulting in less catastrophic forgetting than RL training. This reduced drift also preserves plasticity: after training on one task, FST-trained models adapt more effectively to a subsequent task than parameter-only trained models. In continual learning scenarios, where task domains change on the fly, FST continues to acquire each new task while parameter-only RL stalls.
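To make the "less KL divergence" comparison concrete, here is a minimal sketch of how drift from a base model could be measured on a single next-token distribution. The abstract does not say which direction of KL or which averaging the authors use, so the choice of KL(trained || base) is an assumption, the helper names `kl_divergence` and `relative_reduction` are hypothetical, and the three distributions are made-up numbers, not results from the paper.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in nats for two discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    # Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def relative_reduction(kl_reference: float, kl_other: float) -> float:
    """Fraction by which kl_other is smaller than kl_reference."""
    return 1.0 - kl_other / kl_reference


# Made-up next-token distributions over a three-token vocabulary.
base = [0.5, 0.3, 0.2]             # base model
far_from_base = [0.8, 0.1, 0.1]    # a model that drifted a lot
near_base = [0.6, 0.25, 0.15]      # a model that drifted less

kl_far = kl_divergence(far_from_base, base)
kl_near = kl_divergence(near_base, base)
print(f"KL(far || base)  = {kl_far:.4f}")
print(f"KL(near || base) = {kl_near:.4f}")
print(f"relative reduction = {relative_reduction(kl_far, kl_near):.1%}")
```

A statement such as "70% less KL divergence" corresponds to `relative_reduction` returning 0.7 under this reading; in practice the per-token values would be averaged over many prompts and positions.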

Similar Articles

Learning, Fast and Slow: Towards LLMs That Adapt Continually [R]

@MihaelaVDS: Can LLMs keep learning new skills without updating their weights? Modern LLMs can already master & combine many skills.…

@LakshyAAAgrawal: Learning from rich textual feedback (errors, traces, partial reasoning) beats scalar reward alone for LLM optimization.…

From Weights to Features: SAE-Guided Activation Regularization for LLM Continual Learning

Personal continual learning for LLMs without GPU — position paper [OC]
