This paper presents a finite-iteration theory for asynchronous categorical distributional temporal-difference learning, bridging the gap between existing theoretical frameworks and practical online implementations.