hierarchical-thompson-sampling

Human-in-the-Loop Contextual Bandits for Short-Term Rental Dynamic Pricing: Structural Equivalence of Historical Warm-Up and Approval-Gated Live Learning

arXiv cs.LG · 2026-06-03

The paper introduces the Human-in-the-Loop Gated Bandit (HITL-GB) for short-term rental dynamic pricing. It shows that historical pricing data collected under a prior policy is structurally equivalent to on-policy warm-up data, which reduces the cold-start period from ~150 to ~30 episodes.
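
To make the warm-up idea concrete, here is a minimal sketch of a Thompson sampler that is first seeded from logged history and then learns live only when a human approves the proposed price. It is not the paper's HITL-GB algorithm: the class and method names (`GatedThompsonSampler`, `warm_up`, `step`), the discrete price grid, the flat (non-hierarchical) Beta-Bernoulli model, and the binary "booked" reward are all assumptions chosen for illustration. The point it tries to show is that logged `(price, booked)` pairs and approved live outcomes pass through the same posterior update.

```python
import random
from dataclasses import dataclass


@dataclass
class BetaArm:
    """Beta posterior over the booking probability at one price point."""
    alpha: float = 1.0  # 1 + number of observed bookings
    beta: float = 1.0   # 1 + number of observed non-bookings

    def update(self, booked: int) -> None:
        if booked:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def sample(self, rng: random.Random) -> float:
        return rng.betavariate(self.alpha, self.beta)


class GatedThompsonSampler:
    """Hypothetical approval-gated Thompson sampler over a fixed price grid."""

    def __init__(self, prices, seed=0):
        self.arms = {p: BetaArm() for p in prices}
        self.rng = random.Random(seed)

    def warm_up(self, history):
        # history: (price, booked) pairs logged under a prior pricing policy.
        # Each pair goes through the same update used for live feedback.
        for price, booked in history:
            self.arms[price].update(booked)

    def propose(self):
        # Thompson sampling: draw once from each posterior, pick the best draw.
        return max(self.arms, key=lambda p: self.arms[p].sample(self.rng))

    def step(self, approve, observe):
        # approve(price) -> bool is the human gate; observe(price) -> 0 or 1.
        price = self.propose()
        if not approve(price):
            return None  # rejected proposals produce no posterior update
        self.arms[price].update(observe(price))
        return price


if __name__ == "__main__":
    sampler = GatedThompsonSampler([100, 120, 140])
    sampler.warm_up([(100, 1), (100, 1), (120, 0)])
    chosen = sampler.step(approve=lambda p: p <= 120, observe=lambda p: 1)
    print("approved price:", chosen)
```

In this sketch the reward is a booking indicator rather than revenue, and rejected proposals are simply discarded; a production system would likely need a revenue-aware reward and some handling of the selection effect that the approval gate introduces.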
