Tag
This paper introduces variable-delay real-time RL, a setting in which the environment keeps progressing while the agent deliberates, so the agent must also decide how long to think. It proposes a lightweight gating policy that selects a state-dependent planning budget and outperforms fixed-budget and heuristic baselines in several real-time games.