Tag
A debugger that detects reward hacking in reinforcement-learning reward functions during training, helping developers identify and fix the flaws being exploited.
OpenAI presents a large-scale empirical study of curiosity-driven reinforcement learning without extrinsic rewards across 54 benchmark environments, reporting strong performance and examining how the choice of feature space affects prediction-based reward signals.