Researchers propose a router redesign for Mixture-of-Experts (MoE) models that uses Manifold Power Iteration to align router rows with principal singular directions, which the authors report improves model effectiveness.
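
To make "principal singular direction" concrete, here is a minimal sketch of ordinary power iteration in NumPy. It is not the Manifold Power Iteration from the work summarized above, whose details this summary does not give. The matrix `W`, the helper names, and the choice of the top right singular vector are all assumptions made for illustration; the only "manifold" aspect shown is renormalizing the iterate back onto the unit sphere after each step.

```python
import numpy as np


def top_right_singular_vector(W, num_iters=100, seed=0):
    """Estimate the top right singular vector of W by plain power iteration.

    Hypothetical helper for illustration only; it applies W^T W repeatedly
    and renormalizes so the iterate stays on the unit sphere.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(W.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        v = W.T @ (W @ v)          # one multiplication by W^T W
        v /= np.linalg.norm(v)     # project back onto the unit sphere
    return v


def build_demo_matrix(seed=0):
    """Build an 8x5 matrix with known singular values 5, 2, 1 (assumed data)."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((8, 3)))  # orthonormal columns
    V, _ = np.linalg.qr(rng.standard_normal((5, 3)))  # orthonormal columns
    s = np.array([5.0, 2.0, 1.0])
    W = U @ np.diag(s) @ V.T
    return W, V[:, 0]              # V[:, 0] is the true top right singular vector


if __name__ == "__main__":
    W, v_true = build_demo_matrix()
    v = top_right_singular_vector(W)
    # The estimate matches the true direction up to sign.
    print(abs(v @ v_true))
```

In this sketch a router row would be set to (a scaled copy of) the returned vector; how the actual method chooses the matrix and handles multiple rows is not stated in the summary.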