family-experts


Parameter Alignment Mitigates Catastrophic Forgetting in Multilingual Expert Language Models

arXiv cs.CL · 3d ago

This paper studies catastrophic forgetting in multilingual expert language models during continual pretraining. It proposes parameter alignment strategies, including hard layer freezing, soft regularization, post-hoc weight reversion, and model merging, that mitigate forgetting across 32 training languages at minimal cost to language acquisition.
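For readers unfamiliar with the named strategies, the sketch below shows generic textbook forms of each one on toy weights. It is not the paper's implementation: the `Params` layout, the function names, and the choice of an L2 penalty and linear interpolation are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy representation: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def sgd_step(current: Params, grads: Params, lr: float, frozen: Set[str]) -> Params:
    """Hard layer freezing: frozen layers skip the gradient update entirely."""
    return {
        name: list(w) if name in frozen
        else [x - lr * g for x, g in zip(w, grads[name])]
        for name, w in current.items()
    }


def l2_anchor_penalty(current: Params, base: Params, strength: float) -> float:
    """Soft regularization (assumed L2 form): penalize squared drift from base."""
    total = 0.0
    for name, w in current.items():
        total += sum((x - b) ** 2 for x, b in zip(w, base[name]))
    return strength * total


def revert_layers(current: Params, base: Params, layers: Set[str]) -> Params:
    """Post-hoc reversion: copy the listed layers back from the base model."""
    return {
        name: list(base[name]) if name in layers else list(w)
        for name, w in current.items()
    }


def merge_linear(current: Params, base: Params, alpha: float) -> Params:
    """Model merging (assumed linear interpolation): alpha=1 keeps current, alpha=0 keeps base."""
    return {
        name: [alpha * x + (1 - alpha) * b for x, b in zip(w, base[name])]
        for name, w in current.items()
    }
```

All four share one idea: keep the continually pretrained weights close to the original model, either during training (freezing, regularization) or afterwards (reversion, merging).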
