model-plasticity


When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff

arXiv cs.LG · 2d ago

This paper investigates the loss of model plasticity after excessive supervised fine-tuning (SFT) in the SFT-then-RL pipeline for LLMs, and proposes Rejuvenation, a method that restores plasticity via base-anchored model fusion and targeted neuron reset, consistently improving RL performance.
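
The summary names two operations without giving details, so the following is only a rough illustration of what they could look like, not the paper's actual procedure. It assumes that "base-anchored model fusion" can be approximated by plain linear interpolation between base and SFT weights, and that "targeted neuron reset" means copying base-model weights back into selected rows of a weight matrix. The function names, the `alpha` coefficient, and the caller-supplied `neuron_ids` are all hypothetical; how the paper chooses which neurons to reset is not stated in the summary.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # each row is treated as one "neuron" in this sketch


def fuse_with_base(base: Matrix, sft: Matrix, alpha: float) -> Matrix:
    """Linear interpolation (an assumed stand-in for the paper's fusion step).

    Returns alpha * base + (1 - alpha) * sft, elementwise.
    """
    return [
        [alpha * b + (1.0 - alpha) * s for b, s in zip(base_row, sft_row)]
        for base_row, sft_row in zip(base, sft)
    ]


def reset_neurons(weights: Matrix, base: Matrix, neuron_ids: Sequence[int]) -> Matrix:
    """Copy the base model's row into each selected neuron; other rows are unchanged.

    The selection rule for neuron_ids is left to the caller (hypothetical here).
    """
    out = [row[:] for row in weights]  # copy so the input is not mutated
    for i in neuron_ids:
        out[i] = base[i][:]
    return out


# Toy 2x2 weight matrices standing in for one layer of a base and an SFT model.
base = [[0.0, 0.0], [1.0, 1.0]]
sft = [[2.0, 4.0], [3.0, 5.0]]

fused = fuse_with_base(base, sft, alpha=0.5)
rejuvenated = reset_neurons(fused, base, neuron_ids=[1])
```

In this toy setup, a larger `alpha` pulls the fused weights closer to the base model, while the reset step restores the base values exactly for the chosen rows.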
