pre-trained-models

#pre-trained-models

PACT: Preserving Anchored Cores in Task-vectors for Model Merging

arXiv cs.LG · yesterday

The paper identifies 'Load-Bearing Wall' dimensions in pre-trained models: dimensions that retain task-specific knowledge which task vectors do not fully capture during model merging. It proposes PACT (Preserving Anchored Cores in Task-vectors) to preserve these cores and reports state-of-the-art performance across benchmarks.

Beyond LoRA: Is Sparsity-Induced Adaptation Better?

arXiv cs.LG · 4d ago

This paper proposes sparsity-induced adaptations to LoRA, including Cheap LoRA (cLA) and a chained circulant variant (c³LA). It provides theoretical generalization bounds along with empirical evaluations showing reductions of up to 10% in training time and 15% in peak GPU memory while maintaining competitive performance.
