Tag
The paper identifies 'Load-Bearing Wall' dimensions in pre-trained models that retain task-specific knowledge not fully captured by task vectors in model merging, and proposes PACT (PreserveAnchoredCores) to preserve these cores, achieving state-of-the-art performance across benchmarks.
This paper proposes sparsity-induced adaptations to LoRA, including Cheap LoRA (cLA) and a chained circulant variant (c³LA), and provides theoretical generalization bounds along with empirical evaluations showing up to 10% training time reduction and 15% peak GPU memory savings while maintaining competitive performance.