Tag
A contract-based compositional shielding method for multi-agent reinforcement learning that ensures global safety without centralized runtime control through local linear temporal logic (LTL) obligations, and uses a multi-armed bandit to optimize team reward.