low-rank-adapters


Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time

arXiv cs.CL · 2026-05-18

This paper introduces OP-Mix, a data mixing algorithm that uses low-rank adapters trained on the current model to cheaply simulate candidate data mixtures, enabling efficient and unified data mixing across pretraining, continual midtraining, and continual instruction tuning. OP-Mix consistently finds near-optimal mixtures while using a fraction of the compute of baselines, improving pretraining perplexity by 6.3% and reducing compute by 66-95% in continual learning settings.


Low-Rank Adapters Initialization via Gradient Surgery for Continual Learning

arXiv cs.LG · 2026-05-14

The paper proposes Slice, a gradient-surgery-based initialization for LoRA adapters in continual learning that reconciles conflicting gradients from current and past tasks to reduce catastrophic forgetting, achieving better stability-plasticity trade-offs.
