nested-subnetworks

FlexMoE: One-for-All Nested Intra-Expert Pruning for MoE Language Models

arXiv cs.LG · 5d ago

FlexMoE proposes a one-for-all nested intra-expert pruning method for MoE language models, enabling multiple deployable subnetworks from a single training run with minimal performance loss.
