activation-function

Tag

DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices

Hugging Face Daily Papers · 2d ago

DECO is a sparse MoE architecture that matches dense Transformer performance while activating only 20% of its experts. It combines ReLU-based routing, learnable scaling, and the NormSiLU activation function, and comes with a kernel that provides 3x acceleration.

Approximating Hyperbolic Tangent

Hacker News Top · 2026-04-22

Blog post surveying fast hyperbolic tangent approximations (Taylor series, Padé approximants, splines, and bit-level tricks) for neural-network and real-time audio use.
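
To make two of the listed families concrete, here is a minimal Python sketch of a Taylor and a Padé approximation of tanh. It is not taken from the blog post: the function names `tanh_taylor` and `tanh_pade` are hypothetical, the degree-5 Taylor polynomial and the [3/2] Padé approximant are standard textbook forms chosen for illustration, and the clamp to [-1, 1] is an added assumption to keep the rational form bounded for large inputs.

```python
import math


def tanh_taylor(x: float) -> float:
    """Degree-5 Taylor polynomial of tanh around 0: x - x^3/3 + 2x^5/15.

    Only accurate for small |x|; it diverges quickly outside roughly |x| < 1.
    """
    x2 = x * x
    return x * (1.0 - x2 / 3.0 + 2.0 * x2 * x2 / 15.0)


def tanh_pade(x: float) -> float:
    """[3/2] Pade approximant x(15 + x^2) / (15 + 6x^2), clamped to [-1, 1].

    The unclamped rational function grows without bound for large |x|,
    so the clamp (an assumption of this sketch) restores saturation.
    """
    x2 = x * x
    y = x * (15.0 + x2) / (15.0 + 6.0 * x2)
    return max(-1.0, min(1.0, y))


if __name__ == "__main__":
    # Compare both approximations against the library implementation.
    for x in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(x, math.tanh(x), tanh_taylor(x), tanh_pade(x))
```

The Padé form needs one division but stays usable over the whole real line once clamped, while the Taylor polynomial is cheaper and only suitable near zero. Which trade-off is acceptable depends on the accuracy the application needs.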
