Tag
Proposes Generic TB-Coverage, a coverage-aware expert-pruning method for sparse Mixture-of-Experts language models. The method calibrates on generic text corpora only and preserves cross-corpus expert coverage, improving accuracy and reducing perplexity degradation after pruning.