active-parameters

Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters

arXiv cs.LG ↗ · yesterday

Sigma-Branch restructures pretrained dense networks into a hierarchical binary tree with a shared backbone, routers, and specialized leaves, reducing per-inference active parameters by 58–60% while staying within 1.72 pp of baseline accuracy on CIFAR-100, ImageNet-1K, and ModelNet40.

Is there a limit on the number of active parameters in an MoE model?

Reddit r/LocalLLaMA ↗ · 2026-05-14

A discussion of whether Mixture-of-Experts (MoE) models have a ceiling on active parameter count beyond which quality stops improving.

