rollout-sampling


@SOURADIPCHAKR18: Typical RL algorithms and on-policy distillation methods are blind samplers: they use privileged info to score rollouts…

X AI KOLs Following · yesterday

This work proposes using privileged information to actively sample rollouts in reinforcement learning, rather than only to score them, as typical blind-sampling RL algorithms and on-policy distillation methods do.
