rollout-sampling


@SOURADIPCHAKR18: Typical RL algorithms and on-policy distillation methods are blind samplers: they use privileged info to score rollouts…

X AI KOLs Following · yesterday Cached

This work proposes using privileged information to actively sample rollouts in reinforcement learning, improving on typical blind samplers, which use that information to score rollouts.
