Tag
A tweet from Souradip Chakraborty proposes using privileged information to actively sample rollouts in reinforcement learning, in contrast to traditional blind sampling methods. The tweet opens with a quote about great teachers crafting demonstrations that students could build themselves.
Announcement of a research paper on Pedagogical RL, which proposes using privileged information to actively sample trajectories that RL algorithms typically miss.