Tag
This paper investigates the loss of model plasticity caused by excessive supervised fine-tuning (SFT) in the SFT-then-RL pipeline for LLMs. It proposes Rejuvenation, a method that restores plasticity through base-anchored model fusion and targeted neuron reset, and reports consistent improvements in subsequent RL performance.
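
To make the two ingredients concrete, here is a minimal sketch of what "base-anchored fusion" followed by "targeted neuron reset" could look like on a single weight matrix. The summary does not give the paper's actual fusion rule or neuron-selection criterion, so everything specific here is an assumption for illustration: linear interpolation toward the base weights, squared drift from base as the selection score, and the hypothetical names `fuse`, `reset_neurons`, `alpha`, and `k`.

```python
# Illustrative sketch only. The interpolation rule, the drift-based selection
# criterion, and all names are assumptions, not the paper's method.

def fuse(base, sft, alpha):
    """Interpolate SFT weights toward base: alpha=0 keeps SFT, alpha=1 returns base."""
    return [[(1 - alpha) * s + alpha * b for b, s in zip(b_row, s_row)]
            for b_row, s_row in zip(base, sft)]

def reset_neurons(base, weights, k):
    """Reset the k rows (neurons) with the largest squared drift from base."""
    drift = [sum((w - b) ** 2 for w, b in zip(w_row, b_row))
             for w_row, b_row in zip(weights, base)]
    ranked = sorted(range(len(weights)), key=lambda i: drift[i], reverse=True)
    targets = set(ranked[:k])
    # Targeted rows are copied from base; all other rows are kept unchanged.
    return [list(base[i]) if i in targets else list(weights[i])
            for i in range(len(weights))]

# Toy 3-neuron layer: each row holds one neuron's incoming weights.
base = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
sft = [[0.2, 0.0], [3.0, 1.0], [2.0, 2.4]]

fused = fuse(base, sft, alpha=0.5)
rejuvenated = reset_neurons(base, fused, k=1)
```

In this toy case the second neuron drifts furthest from the base model, so it is the one reset to its base values, while the other two keep their fused weights. A real implementation would apply such steps per layer of the checkpoint and would need the paper's own criterion for choosing which neurons to reset.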