
HydraHead: From Head-Level Functional Heterogeneity to Specialized Attention Hybridization

Hugging Face Daily Papers · 2026-06-18

HydraHead is a novel attention hybridization architecture that combines Full and Linear Attention at the head level, achieving superior long-context performance with reduced training overhead via interpretability-driven selection and scale-normalized fusion.
