pre-trained-models

#pre-trained-models

PACT: Preserving Anchored Cores in Task-vectors for Model Merging

arXiv cs.LG · yesterday

The paper identifies 'Load-Bearing Wall' dimensions in pre-trained models: dimensions that retain task-specific knowledge which task vectors do not fully capture during model merging. It proposes PACT (Preserving Anchored Cores in Task-vectors) to preserve these cores and reports state-of-the-art performance across benchmarks.


#pre-trained-models

Beyond LoRA: Is Sparsity-Induced Adaptation Better?

arXiv cs.LG · 4d ago

This paper proposes sparsity-induced adaptations to LoRA, including Cheap LoRA (cLA) and a chained circulant variant (c³LA). It provides theoretical generalization bounds and empirical evaluations showing up to a 10% reduction in training time and 15% savings in peak GPU memory while maintaining competitive performance.

