Tag
The paper introduces the Human-in-the-Loop Gated Bandit (HITL-GB) for dynamic pricing of short-term rentals. It shows that historical pricing data collected under a prior policy is structurally equivalent to on-policy warm-up data, which reduces the cold-start period from ~150 to ~30 episodes.
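
As a rough illustration of the warm-start idea, the sketch below replays logged (price, reward) pairs into a simple per-arm running-mean bandit before any live episodes are run. It is not the paper's HITL-GB algorithm: the `PriceBandit` class, the candidate price grid, the nearest-price arm mapping, and the sample history are all hypothetical, and the human-in-the-loop gating step is omitted. The sketch takes the claimed equivalence at face value and treats each historical record as if it were an on-policy pull.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PriceBandit:
    """Hypothetical bandit keeping a running mean reward per candidate price."""

    prices: list[float]
    counts: list[int] = field(init=False)
    means: list[float] = field(init=False)

    def __post_init__(self) -> None:
        self.counts = [0] * len(self.prices)
        self.means = [0.0] * len(self.prices)

    def update(self, arm: int, reward: float) -> None:
        # Incremental mean update for the pulled arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

    def warm_start(self, history: list[tuple[float, float]]) -> int:
        """Replay logged (price, reward) pairs as if they were on-policy pulls.

        Returns the number of records replayed.
        """
        for price, reward in history:
            # Assumption: map each logged price to the nearest candidate arm.
            arm = min(
                range(len(self.prices)),
                key=lambda i: abs(self.prices[i] - price),
            )
            self.update(arm, reward)
        return len(history)

    def greedy_arm(self) -> int:
        # Index of the arm with the highest estimated mean reward.
        return max(range(len(self.prices)), key=lambda i: self.means[i])


# Illustrative data only: nightly prices and the revenue observed at each.
bandit = PriceBandit(prices=[80.0, 100.0, 120.0])
history = [(82.0, 60.0), (99.0, 90.0), (101.0, 110.0), (118.0, 70.0)]
replayed = bandit.warm_start(history)
```

After the replay, every arm already has a non-zero count and a reward estimate, so live exploration starts from informed estimates instead of from scratch. That is the intuition behind a shorter cold-start period.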