This paper theoretically demonstrates that two-layer neural networks trained on group composition tasks learn spectral representations, with neurons converging to irreducible representations and achieving rotational rank-one alignment, providing a representation-theoretic account of feature learning.
DeepMDMD combines deep learning with algebraic constraints to learn compact, dynamically coherent Koopman operator representations that enforce the product rule as an exact constraint. The method outperforms geometric approaches on high-dimensional chaotic and fluid dynamics problems, reducing spectral pollution and enabling stable long-term forecasting.
This paper proposes Energy-Gated Attention (EGA) and Morlet Positional Encoding (MoPE) to supply two inductive biases that transformer attention lacks: token salience and scale-adaptive locality. Experiments on TinyShakespeare show superadditive gains when the two are combined, indicating that they are complementary.
This paper proposes DG-Hard, a post-hoc spectral repair method that recovers, without any retraining, capabilities damaged by fine-tuning, using only the pretrained and fine-tuned checkpoints. It applies Donoho-Gavish hard singular-value thresholding to the weight updates, removing noise and restoring the degraded performance.
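As a rough illustration of hard singular-value thresholding applied to a weight update, here is a minimal NumPy sketch. It is not the paper's implementation: the function name `hard_threshold_repair` is hypothetical, the noise level is assumed unknown, and the cutoff uses the published median-based approximation to the optimal hard threshold, tau = omega(beta) * median(singular values) with omega(beta) ≈ 0.56 beta^3 - 0.95 beta^2 + 1.82 beta + 1.43. How DG-Hard actually selects layers or estimates noise is not stated in the summary.

```python
import numpy as np


def omega_approx(beta):
    """Polynomial approximation of the optimal hard-threshold coefficient
    for an unknown noise level, where beta is the aspect ratio in (0, 1]."""
    return 0.56 * beta**3 - 0.95 * beta**2 + 1.82 * beta + 1.43


def hard_threshold_repair(w_pre, w_ft):
    """Denoise the fine-tuning update of one weight matrix (hypothetical helper).

    Assumption: the update w_ft - w_pre is a low-rank signal plus roughly
    i.i.d. noise, so singular values below the cutoff are treated as noise.
    Returns the repaired weights and the number of singular values kept.
    """
    delta = w_ft - w_pre
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    m, n = delta.shape
    beta = min(m, n) / max(m, n)
    tau = omega_approx(beta) * np.median(s)  # median-based cutoff
    keep = s > tau
    # Rebuild the update from the retained singular triplets only.
    delta_denoised = (u[:, keep] * s[keep]) @ vt[keep]
    return w_pre + delta_denoised, int(keep.sum())
```

Only the two checkpoints are needed as inputs, which matches the post-hoc setting described above; in practice the function would be called once per weight matrix.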
Spectral Tempering (SpecTemp) is a learning-free method for embedding compression in dense passage retrieval. It adaptively determines the optimal spectral scaling from a signal-to-noise ratio analysis and outperforms fixed-hyperparameter approaches such as PCA and whitening.
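To make the fixed baselines concrete, the sketch below shows one common way to view PCA and whitening as two settings of a single spectral-scaling exponent. This framing, the function name `spectral_scale`, and the parameter `gamma` are assumptions for illustration; the code does not implement SpecTemp's adaptive, signal-to-noise-based rule, which the summary does not specify.

```python
import numpy as np


def spectral_scale(embeddings, k, gamma):
    """Project embeddings onto the top-k principal directions and rescale
    each direction by eigenvalue ** (-gamma / 2) (hypothetical helper).

    gamma = 0.0 gives plain PCA projection; gamma = 1.0 gives PCA whitening.
    Intermediate values interpolate between the two fixed baselines.
    """
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (len(embeddings) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]     # indices of the top-k directions
    lam, vecs = eigvals[order], eigvecs[:, order]
    return (centered @ vecs) * lam ** (-gamma / 2.0)
```

With `gamma` fixed in advance, the same scaling is applied regardless of how noisy the trailing directions are, which is the limitation an adaptive choice is meant to address.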