This paper introduces Gradient Times Difference from Reference (GXD), a theoretically motivated measure of neuron utility for restoring plasticity in deep networks during continual learning. It argues that GXD estimates the cost of an intervention more reliably than existing proxy signals such as activation magnitude.
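To make the contrast with activation magnitude concrete, the following is a minimal sketch that reads the name literally: each neuron is scored by the absolute value of its gradient times the difference between its activation and a reference value. The function names, the batch-mean reference, and the averaging over the batch are all assumptions made for illustration; the summary above does not specify the paper's exact formulation.

```python
import numpy as np

def gxd_utility(activations, gradients, reference):
    """Hypothetical GXD score: mean over the batch of |gradient * (activation - reference)|.

    activations, gradients: arrays of shape (batch, neurons).
    reference: array of shape (neurons,), the value each neuron would be replaced by.
    The absolute value and the batch averaging are assumptions, not taken from the paper.
    """
    activations = np.asarray(activations, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    return np.abs(gradients * (activations - reference)).mean(axis=0)

def activation_magnitude(activations):
    """Proxy signal for comparison: mean absolute activation per neuron."""
    return np.abs(np.asarray(activations, dtype=float)).mean(axis=0)

# Toy batch of 3 samples and 2 neurons.
# Neuron 0 has a large, constant activation that the loss ignores (zero gradient).
# Neuron 1 has small activations that the loss is sensitive to.
h = np.array([[5.0, 0.1], [5.0, -0.1], [5.0, 0.2]])
g = np.array([[0.0, 2.0], [0.0, 2.0], [0.0, 2.0]])
reference = h.mean(axis=0)  # assumed reference: the batch-mean activation

gxd = gxd_utility(h, g, reference)
mag = activation_magnitude(h)
print("GXD score per neuron:", gxd)
print("Activation magnitude per neuron:", mag)
```

In this toy case the two signals rank the neurons in opposite order: activation magnitude favours neuron 0, while the gradient-weighted score assigns it zero because replacing it with the reference would not change the loss to first order.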