tabular-mdp

Exploring Starts Are Not Enough: Counterexamples and a Fix for Monte Carlo Exploring Starts

arXiv cs.LG · 5d ago

This paper presents counterexamples showing that Monte Carlo Exploring Starts can converge to suboptimal solutions in tabular reinforcement learning. It also provides a modification that guarantees convergence to optimality by scaling learning rates inversely to update frequencies.
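
To make the setting concrete, here is a minimal sketch of tabular Monte Carlo Exploring Starts in which the step size for each state-action pair is 1 / N(s, a), where N(s, a) counts how often that pair has been updated. This is only one possible reading of "scaling learning rates inversely to update frequencies"; the summary does not give the paper's exact rule, which may differ. The toy MDP, the `TRANSITIONS` table, and the function names `greedy` and `mces` are hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical toy MDP (not from the paper).
# TRANSITIONS[state][action] = (next_state, reward); next_state None means terminal.
TRANSITIONS = {
    0: {0: (None, 0.0), 1: (1, 0.0)},
    1: {0: (None, 1.0), 1: (None, 2.0)},
}


def greedy(q, state):
    """Return the action with the highest estimated value in `state`."""
    return max(q[state], key=q[state].get)


def mces(episodes=5000, seed=0):
    """Monte Carlo Exploring Starts with a per-pair step size of 1 / N(s, a).

    Assumption: the 1 / N(s, a) step size is an illustrative choice,
    not necessarily the schedule proposed in the paper.
    """
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in TRANSITIONS.items()}
    counts = {s: {a: 0 for a in acts} for s, acts in TRANSITIONS.items()}
    pairs = [(s, a) for s, acts in TRANSITIONS.items() for a in acts]

    for _ in range(episodes):
        # Exploring start: the first state-action pair is drawn uniformly.
        state, action = rng.choice(pairs)
        trajectory = []
        while state is not None:
            next_state, reward = TRANSITIONS[state][action]
            trajectory.append((state, action, reward))
            state = next_state
            if state is not None:
                # After the first step, act greedily on the current estimates.
                action = greedy(q, state)

        # Walk the episode backwards, accumulating the undiscounted return.
        ret = 0.0
        for s, a, r in reversed(trajectory):
            ret += r
            counts[s][a] += 1
            # Step size shrinks with the number of updates to this pair.
            q[s][a] += (ret - q[s][a]) / counts[s][a]
    return q
```

Calling `mces()` returns a nested dictionary of value estimates, and `greedy(q, s)` reads the learned policy from it. This toy problem is acyclic and deterministic, so it does not reproduce the paper's counterexamples; it only shows where an update-count-dependent step size enters the algorithm.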
