sparse-mixture-of-experts

silx-ai/Quasar-Preview

Hugging Face Models Trending ↗ · 2026-06-08 Cached

SILX AI releases Quasar-Preview, an 18B-parameter Mixture-of-Experts (MoE) foundation model with 2B active parameters and an experimental 5M-token context window. It is built on a hybrid recurrent/attention architecture and designed for decentralized training via Bittensor SN24.

HodgeCover: Higher-Order Topological Coverage Drives Compression of Sparse Mixture-of-Experts

arXiv cs.LG ↗ · 2026-05-15 Cached

HodgeCover uses higher-order topological coverage to compress sparse Mixture-of-Experts layers by addressing irreducible mergeability barriers that pairwise signals miss, matching state-of-the-art baselines on expert reduction and leading on aggressive compression.
