sparse-mixture-of-experts

#sparse-mixture-of-experts

silx-ai/Quasar-Preview

Hugging Face Models Trending · cached 2026-06-08

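SILX AI has released Quasar-Preview, an 18B-parameter MoE base model with 2B active parameters and an experimental 5M-token context window. It is built on a hybrid recurrent/attention architecture and designed for decentralized training via Bittensor SN24.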

#sparse-mixture-of-experts

arXiv cs.LG · cached 2026-05-15

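HodgeCover compresses sparse mixture-of-experts layers using higher-order topological covers. By addressing irreducible mergeability obstructions that pairwise signals miss, it matches state-of-the-art baselines on expert reduction and leads under aggressive compression.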
