contextual-bandits

#contextual-bandits

Online Pandora's Box for Contextual LLM Cascading

arXiv cs.AI · 3d ago

This paper introduces an online contextual Pandora's Box model for adaptively querying and selecting LLM APIs, proposing a learning approach that combines GMM estimation with UCB-style confidence bounds and proving dimension-dependent regret bounds.

Human-in-the-Loop Contextual Bandits for Short-Term Rental Dynamic Pricing: Structural Equivalence of Historical Warm-Up and Approval-Gated Live Learning

arXiv cs.LG · 2026-06-03

This paper introduces the Human-in-the-Loop Gated Bandit (HITL-GB) for short-term rental dynamic pricing and shows that historical pricing data collected under a prior policy is structurally equivalent to on-policy warm-up data, which reduces the cold-start period from ~150 to ~30 episodes.

Catching a Moving Subspace: Low-Rank Bandits Beyond Stationarity

arXiv cs.LG · 2026-05-21

This paper studies piecewise-stationary low-rank linear contextual bandits, proposes the SPSC algorithm, whose dynamic regret scales with the intrinsic rank rather than the ambient dimension, and characterizes the identification boundary for subspace recovery under scalar feedback.
