Tag
FlexMoE is a one-for-all nested intra-expert pruning method for Mixture-of-Experts (MoE) language models: a single training run yields multiple deployable subnetworks of different sizes with minimal performance loss.
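
To make "nested intra-expert pruning" concrete, here is a minimal sketch of one way nesting can work inside a single expert. It is not the FlexMoE implementation, which this summary does not describe. It assumes each expert is a two-layer feed-forward block with a ReLU activation, and that a subnetwork keeps only the first `keep` hidden units, so every smaller subnetwork is a prefix of every larger one. The function name `expert_forward`, the weight layout, and the toy values are all hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def expert_forward(x: Vector, w_in: Matrix, w_out: Matrix, keep: int) -> Vector:
    """Run one feed-forward expert using only its first `keep` hidden units.

    Assumed layout (illustrative, not taken from the source):
      w_in[h]  holds the input weights of hidden unit h  (length d_model)
      w_out[h] holds the output weights of hidden unit h (length d_model)
    Because units are always taken as a prefix, the subnetwork for a small
    `keep` is contained in the subnetwork for any larger `keep`.
    """
    if not 0 < keep <= len(w_in):
        raise ValueError("keep must be between 1 and the number of hidden units")

    # Hidden activations for the retained prefix of units (ReLU).
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(w_in[h], x)))
        for h in range(keep)
    ]

    # Project the retained activations back to the model dimension.
    d_model = len(w_out[0])
    return [
        sum(hidden[h] * w_out[h][j] for h in range(keep))
        for j in range(d_model)
    ]


# Toy expert: 4 hidden units, model dimension 2 (hypothetical values).
W_IN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 1.0]]
W_OUT = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, -1.0]]
X = [1.0, 2.0]

small = expert_forward(X, W_IN, W_OUT, keep=2)  # half-width subnetwork
full = expert_forward(X, W_IN, W_OUT, keep=4)   # full expert
```

Both calls share the same weight matrices, so a single set of trained parameters serves every width. Which hidden units end up in the prefix, and how training keeps the smaller prefixes accurate, is the part a real method has to specify. This sketch leaves both open.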