Tag
This paper presents PushCen-ADFL, a communication-efficient asynchronous decentralized federated learning framework that uses centroid-based messaging and bias correction to improve accuracy and reduce communication overhead under heterogeneous conditions.
SignMuon is a 1-bit, matrix-aware optimizer for distributed training that combines signSGD's majority-vote sign aggregation with Muon's polar-step framework, achieving a 32x bandwidth reduction relative to float32 while maintaining strong convergence and performance on benchmarks such as CIFAR-10/ResNet-50 and nanoGPT.
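As a rough illustration of the majority-vote sign aggregation mentioned above, the following minimal sketch shows the signSGD-style voting step only. It does not implement Muon's polar step or SignMuon itself; the function names (`sign`, `majority_vote`, `bandwidth_ratio`), the tie-breaking rule, and the toy gradients are assumptions made for this example.

```python
from typing import List


def sign(x: float) -> int:
    """Return +1 for non-negative values and -1 otherwise (a 1-bit code)."""
    return 1 if x >= 0 else -1


def majority_vote(worker_grads: List[List[float]]) -> List[int]:
    """Aggregate per-worker gradient signs by coordinate-wise majority vote."""
    n_coords = len(worker_grads[0])
    votes = [0] * n_coords
    for grad in worker_grads:
        for i, g in enumerate(grad):
            votes[i] += sign(g)  # each worker contributes one bit per coordinate
    # Ties resolve to +1 here; this is an arbitrary choice for the sketch.
    return [sign(v) for v in votes]


def bandwidth_ratio(bits_full: int = 32, bits_sign: int = 1) -> float:
    """Bits per coordinate for float32 divided by bits per transmitted sign."""
    return bits_full / bits_sign


# Toy gradients from three hypothetical workers, three coordinates each.
grads = [
    [0.5, -1.2, 0.1],
    [0.3, -0.4, -0.9],
    [-0.2, -0.7, -0.3],
]
aggregated = majority_vote(grads)
print(aggregated)         # [1, -1, -1]
print(bandwidth_ratio())  # 32.0
```

Sending one sign bit per coordinate instead of a 32-bit float is where the 32x figure comes from; the aggregated vector carries only the direction on which most workers agree.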
This paper presents CARE, a framework that jointly learns control inputs and communication-efficient timing decisions under a pointwise Lyapunov safety shield, achieving longer inter-sample intervals than classical methods on inverted pendulum, cart-pole, and planar quadrotor systems.
The paper introduces an information-theoretic framework for communication-efficient expert routing in sparse mixture-of-experts models, treating the gate as a stochastic channel and deriving practical mutual information estimators to analyze accuracy-rate tradeoffs over finite expert banks.
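To make the "gate as a stochastic channel" view concrete, here is a minimal sketch of a standard plug-in estimate of the mutual information between inputs and selected experts, computed as I(X; E) = H(E) - H(E | X) from a batch of gate probability vectors. This is a textbook estimator used for illustration, not necessarily the one derived in the paper; the function names and the toy gate outputs are assumptions.

```python
import math
from typing import List


def entropy_bits(p: List[float]) -> float:
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def gate_mutual_information(gate_probs: List[List[float]]) -> float:
    """Plug-in estimate of I(X; E) from per-input gate distributions p(e | x).

    Inputs are weighted uniformly, so the expert marginal is the batch mean.
    """
    n = len(gate_probs)
    k = len(gate_probs[0])
    marginal = [sum(row[j] for row in gate_probs) / n for j in range(k)]
    conditional = sum(entropy_bits(row) for row in gate_probs) / n
    return entropy_bits(marginal) - conditional


# A deterministic, balanced gate over two experts carries one full bit.
sharp_gate = [[1.0, 0.0], [0.0, 1.0]]
# A gate that ignores its input carries no information about it.
blind_gate = [[0.5, 0.5], [0.5, 0.5]]

rate_sharp = gate_mutual_information(sharp_gate)
rate_blind = gate_mutual_information(blind_gate)
```

With K experts the estimate cannot exceed log2(K) bits, which is one way a finite expert bank caps the routing rate.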
Federation of Experts (FoE) restructures mixture-of-experts blocks into clusters that process KV heads independently, eliminating inter-node communication bottlenecks and improving inference throughput and latency by up to 5.2x while maintaining generation quality.