rarely-switching

Contextual Slate GLM Bandits with Limited Adaptivity

arXiv cs.LG · 13h ago

Proposes algorithms for contextual slate bandits with generalized linear rewards under limited adaptivity, achieving regret bounds independent of the non-linearity parameter. The batched and rarely-switching algorithms are computationally efficient and empirically outperform baselines, including in a language model example selection task.
