Tag
Proposes R2R2, a regularization method for self-predictive learning in reinforcement learning to mitigate overfitting under high update-to-data ratios, achieving significant improvements on continuous control tasks.