learning-rates

#learning-rates

Exploring Starts Are Not Enough: Counterexamples and a Fix for Monte Carlo Exploring Starts

arXiv cs.LG · 5d ago

This paper presents counterexamples showing that Monte Carlo Exploring Starts can converge to suboptimal solutions in tabular reinforcement learning, and provides a modification that guarantees convergence to optimality by scaling learning rates inversely to update frequencies.


#learning-rates

Balancing Learning Rates Across Layers: Exact Two-Step Dynamics and Optimal Scaling in Linear Neural Networks

arXiv cs.LG · 2026-06-02

This paper derives exact closed-form expressions for the gradients and test loss after one and two steps of gradient descent in two-layer and three-layer linear neural networks. It characterizes the optimal learning-rate choice and identifies a distinct early-training regime in which unequal layer-wise learning rates are optimal.

