Puzzling Success of Overparameterization: Lottery Tickets or Escape Dimensions?
Summary
A paper investigating why overparameterization succeeds in neural networks, weighing the lottery ticket hypothesis against escape dimensions as competing explanations.
Similar Articles
Feature Lottery? A Bifurcation Theory of Concept Emergence
This paper introduces a bifurcation theory of representation dynamics to detect when neural networks acquire structured representations during training, using a Hessian analysis of a GMM probe. The resulting ratio β/β_c serves as a label-free phase coordinate that predicts the onset of usable structure and can forecast feature interpretability in sparse autoencoders early in training.
Better exploration with parameter noise
OpenAI presents parameter noise, a technique that adds adaptive noise to the parameters of a neural network policy rather than to its actions, enabling agents to learn tasks significantly faster than traditional action noise approaches. The method achieves 2x faster learning on HalfCheetah and represents a middle ground between evolution strategies and deep RL approaches like TRPO and DDPG.
@ChrisGPotts: We take for granted that larger models are better than smaller ones, but why is this so? Our new paper, led by Jing Hua…
This paper investigates why larger models outperform smaller ones, attributing it to data-induced competition for neural resources through formal analysis and experiments.
Mitigating the Curse of Dimensionality in Uniform Convergence of Deep Neural Networks via Smooth Activations
This paper establishes a theoretical framework showing that smooth activations in deep neural networks can mitigate the curse of dimensionality in uniform convergence, providing non-asymptotic guarantees and outperforming ReLU networks in worst-case reliability.
Interpreting Neural Combinatorial Optimization via Evolving Programmatic Bottlenecks
Introduces Evolving Programmatic Bottlenecks (EPB), a framework for interpreting neural combinatorial optimization policies by distilling black-box models into human-readable program portfolios using LLM-guided evolution.