Tag
CollectionLoRA proposes a multi-teacher on-policy distillation framework that distills up to 50 different effect LoRAs into a single LoRA, reducing deployment overhead while maintaining concept fidelity.
Mix-MoE proposes a mixed Mixture-of-Experts framework with specialized expert groups and Fourier-transform-enhanced routing to mitigate parameter interference in multilingual machine translation, and reports significant improvements over baselines.