bandit

#bandit

Distributed Online Bandit Submodular Maximization with Bounded Sampling Violations

arXiv cs.LG · 23h ago

This paper presents a unified algorithmic framework for distributed online submodular maximization under partition matroid constraints, achieving sublinear (1-1/e)-regret guarantees for both full-information and bandit feedback. It also introduces a bounded stochastic pipage rounding scheme to ensure cumulative sampling violations remain sublinear.
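
For reference, the (1-1/e)-regret named in this summary is usually measured against a (1-1/e) fraction of the best fixed feasible set in hindsight. The block below gives that standard single-learner form as a sketch. The symbols f_t (round-t submodular objective), S_t (set played in round t), \mathcal{I} (feasible sets of the partition matroid) and T (horizon) are assumed here for illustration, and the paper's distributed definition may differ in details such as expectations or per-agent terms.

```latex
\[
R_{1-1/e}(T) \;=\; \left(1-\frac{1}{e}\right)\max_{S\in\mathcal{I}}\sum_{t=1}^{T} f_t(S)
\;-\;\sum_{t=1}^{T} f_t(S_t)
\]
```

A guarantee is sublinear when R_{1-1/e}(T) = o(T), so the average per-round gap to the (1-1/e) benchmark vanishes as T grows.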


#bandit

ALSO: Adversarial Online Strategy Optimization for Social Agents

arXiv cs.AI · 2026-05-18

ALSO introduces a framework for online strategy optimization in multi-agent social simulation, formulating multi-turn interaction as an adversarial bandit problem and using a neural surrogate for reward prediction. Experiments on the Sotopia benchmark show it outperforms static baselines and existing optimization methods.
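
The summary does not say which bandit algorithm ALSO builds on. For readers new to the adversarial bandit setting it mentions, here is a minimal sketch of EXP3, a standard algorithm for that setting, and not the paper's method. The names `exp3`, `reward_fn`, `num_arms`, `horizon` and `gamma` are hypothetical, and rewards are assumed to lie in [0, 1].

```python
import math
import random


def exp3(num_arms, reward_fn, horizon, gamma=0.1, seed=0):
    """Run EXP3 for `horizon` rounds; return (total reward, final arm probabilities).

    reward_fn(t, arm) is a hypothetical callback returning a reward in [0, 1].
    Only the reward of the pulled arm is observed (bandit feedback).
    """
    rng = random.Random(seed)
    weights = [1.0] * num_arms
    total = 0.0

    def arm_probs():
        # Mix the exponential-weights distribution with uniform exploration.
        wsum = sum(weights)
        return [(1 - gamma) * w / wsum + gamma / num_arms for w in weights]

    for t in range(horizon):
        probs = arm_probs()
        arm = rng.choices(range(num_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        total += reward
        # Importance-weighted estimate: unbiased for the pulled arm's reward.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)
        # Rescale so the largest weight is 1; probabilities are unchanged.
        top = max(weights)
        weights = [w / top for w in weights]

    return total, arm_probs()


if __name__ == "__main__":
    # Toy environment (an assumption for the demo): arm 2 always pays 1.
    total, probs = exp3(3, lambda t, arm: 1.0 if arm == 2 else 0.0, horizon=2000)
    print(total, probs)
```

The uniform mixing term `gamma / num_arms` keeps every arm's probability bounded away from zero, which keeps the importance-weighted estimate bounded.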

