limited-adaptivity


Contextual Slate GLM Bandits with Limited Adaptivity

arXiv cs.LG · 4h ago Cached

Proposes algorithms for contextual slate bandits with generalized linear rewards under limited adaptivity, achieving regret bounds independent of the non-linearity parameter. The batched and rarely-switching algorithms are computationally efficient and empirically outperform baselines, including in a language model example selection task.
