OpenAI demonstrates that agents trained in a simulated hide-and-seek environment progress through six distinct phases of emergent strategy and tool use (e.g., fort building and ramp use) driven purely by multi-agent competition, with no explicit incentive to interact with objects. The work suggests that multi-agent co-adaptation can act as a self-supervised curriculum that produces increasingly complex intelligent behavior.
OpenAI demonstrates that competitive self-play in simulated 3D robot environments enables AI agents to discover complex physical behaviors like tackling, ducking, and faking without explicit instruction, suggesting self-play will be fundamental to future powerful AI systems.
OpenAI presents research on multi-agent reinforcement learning environments where agents learn to cooperate, compete, and communicate. The paper introduces MADDPG (multi-agent deep deterministic policy gradient), an actor-critic method in which each agent's critic is centralized (conditioned on all agents' observations and actions) while its actor remains decentralized, enabling agents to learn collaborative strategies and communication protocols more reliably than independent, fully decentralized learners.
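The centralized-critic, decentralized-actor split in MADDPG can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the tiny NumPy MLPs, layer sizes, and dimension constants are all assumptions chosen for brevity, and training updates are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    # Random weights for a tiny tanh MLP (illustrative only, untrained).
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2  # assumed toy dimensions

# Decentralized actors: each maps only its OWN observation to an action,
# so execution needs no global information.
actors = [mlp([OBS_DIM, 16, ACT_DIM]) for _ in range(N_AGENTS)]

# Centralized critics: each sees ALL agents' observations and actions
# during training, which keeps the learning problem stationary from the
# critic's point of view even as the other agents' policies change.
critic_in = N_AGENTS * (OBS_DIM + ACT_DIM)
critics = [mlp([critic_in, 16, 1]) for _ in range(N_AGENTS)]

obs = [rng.standard_normal(OBS_DIM) for _ in range(N_AGENTS)]
acts = [np.tanh(forward(actors[i], obs[i])) for i in range(N_AGENTS)]
joint = np.concatenate(obs + acts)  # joint input used only at train time
q_values = [forward(critics[i], joint) for i in range(N_AGENTS)]
```

At execution time only the actors are used, so the extra information the critics consume during training imposes no runtime coordination cost.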
OpenAI researchers demonstrate that cooperative AI agents can develop their own grounded, compositional language through reinforcement learning in simple simulated worlds. The agents are rewarded only for achieving goals that require coordination, and a shared symbolic language emerges as the means to that end.