This paper introduces RQIQN, a robust quantile-based method for distributional reinforcement learning that uses Wasserstein geometry regularization to prevent distribution degeneration and improve performance in risk-sensitive tasks.
This paper introduces C-DSAC, a distributional reinforcement learning algorithm that uses the Cramér distance to improve performance and stability over standard SAC on robotic benchmarks.
This paper presents a finite-iteration theory for asynchronous categorical distributional temporal-difference learning, bridging the gap between existing theoretical frameworks and practical online implementations.