Learning with opponent-learning awareness
Summary
OpenAI presents LOLA (Learning with Opponent-Learning Awareness), a multi-agent reinforcement learning method in which each agent shapes the anticipated learning updates of the other agents rather than treating them as static. The approach demonstrates the emergence of cooperation in the iterated prisoner's dilemma and convergence to Nash equilibria in other game-theoretic settings.
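The core idea can be illustrated on a toy problem. Below is a minimal sketch, assuming a scalar bilinear zero-sum game (V1 = x·y, V2 = -x·y, a matching-pennies flavour) with hand-derived gradients; the function name, learning rates, and game are illustrative, not from the paper's codebase. Each agent's gradient step gains a first-order correction term that differentiates its value through the opponent's anticipated learning step.

```python
import numpy as np

# Toy zero-sum bilinear game: V1(x, y) = x * y, V2(x, y) = -x * y,
# with scalar policy parameters x and y. All names here are
# illustrative, not from the paper's implementation.

def lola_step(x, y, alpha=0.1, eta=0.3):
    """One first-order LOLA-style update for both agents.

    Naive simultaneous gradient ascent would use only dV1/dx and
    dV2/dy. The LOLA correction adds a term that accounts for the
    opponent's anticipated learning step:
        dx = dV1/dx + eta * (dV1/dy) * d2V2/(dx dy)
        dy = dV2/dy + eta * (dV2/dx) * d2V1/(dy dx)
    """
    dV1_dx, dV1_dy = y, x      # gradients of V1 = x*y
    dV2_dx, dV2_dy = -y, -x    # gradients of V2 = -x*y
    d2V2_dxdy = -1.0           # cross second derivative of V2
    d2V1_dydx = 1.0            # cross second derivative of V1
    new_x = x + alpha * (dV1_dx + eta * dV1_dy * d2V2_dxdy)
    new_y = y + alpha * (dV2_dy + eta * dV2_dx * d2V1_dydx)
    return new_x, new_y

# Naive simultaneous gradient ascent spirals away from the Nash
# equilibrium (0, 0) in this game; the correction term (eta > 0)
# pulls the iterates inward instead.
x, y = 1.0, 1.0
for _ in range(300):
    x, y = lola_step(x, y)
print(abs(x) < 1e-2 and abs(y) < 1e-2)  # → True
```

With eta = 0 the update reduces to naive gradient ascent, whose iterates here have per-step modulus greater than 1 and so never settle; the opponent-aware term damps the rotation.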
Similar Articles
Learning to model other minds
OpenAI and University of Oxford researchers present LOLA (Learning with Opponent-Learning Awareness), a reinforcement learning method that enables agents to model and account for the learning of other agents, discovering cooperative strategies in multi-agent games like the iterated prisoner's dilemma and coin game.
Learning to cooperate, compete, and communicate
OpenAI presents research on multi-agent reinforcement learning environments where agents learn to cooperate, compete, and communicate. The paper introduces MADDPG (Multi-Agent Deep Deterministic Policy Gradient), a centralized-critic approach that enables agents to learn collaborative strategies and communication protocols more effectively than fully decentralized methods.
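The structural point of a centralized critic is that each actor sees only its own observation at execution time, while each critic is trained on the joint observations and actions of all agents. A minimal structural sketch, assuming linear actors and critics and made-up dimensions (none of these names come from the paper's code):

```python
import numpy as np

# Structural sketch of centralized-critic training (MADDPG-style).
# Linear "networks" stand in for neural networks; all shapes and
# helper names are illustrative assumptions.

N_AGENTS, OBS_DIM, ACT_DIM = 2, 4, 2
rng = np.random.default_rng(0)

# Each agent has its own actor, which sees only its own observation
# (decentralized execution)...
actors = [rng.normal(size=(OBS_DIM, ACT_DIM)) * 0.1 for _ in range(N_AGENTS)]

def act(i, obs_i):
    return np.tanh(obs_i @ actors[i])

# ...but each critic scores the JOINT observation-action vector of
# every agent, which is what makes the training centralized.
critics = [rng.normal(size=(N_AGENTS * (OBS_DIM + ACT_DIM),)) * 0.1
           for _ in range(N_AGENTS)]

def q_value(i, all_obs, all_acts):
    joint = np.concatenate(list(all_obs) + list(all_acts))
    return float(critics[i] @ joint)  # linear critic for brevity

obs = [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)]
acts = [act(i, obs[i]) for i in range(N_AGENTS)]
print(q_value(0, obs, acts))
```

Because the critic conditions on everyone's actions, the environment looks stationary from its perspective during training even as the other agents' policies change.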
Learning policy representations in multiagent systems
OpenAI researchers propose a general framework for learning representations of agent policies in multiagent systems using minimal interaction data, casting the problem as representation learning with applications to competitive control and cooperative communication environments.
Learning to communicate
OpenAI researchers demonstrate that cooperative AI agents can develop their own grounded and compositional language through reinforcement learning in simple worlds. The agents learn to communicate by being rewarded for achieving goals that require coordination, creating shared symbolic languages to coordinate behavior.
Preference Estimation via Opponent Modeling in Multi-Agent Negotiation
This paper proposes a novel preference estimation method that integrates natural language information from LLMs into a structured Bayesian opponent modeling framework for multi-agent negotiation. The approach leverages LLMs to extract qualitative cues from utterances and convert them into probabilistic formats, demonstrating improved agreement rates and preference estimation accuracy on multi-party negotiation benchmarks.
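The "convert qualitative cues into probabilistic form" step amounts to a Bayesian posterior update over discrete opponent preference types. A hedged sketch under assumed type labels and a hard-coded likelihood (the paper would obtain such a likelihood from an LLM reading the utterance; everything named here is illustrative):

```python
import numpy as np

# Structured Bayesian opponent modeling, sketched: keep a posterior
# over discrete opponent preference types and update it with a
# likelihood derived from a qualitative cue. Type names and numbers
# are made up for illustration.

TYPES = ["price-sensitive", "deadline-sensitive", "quality-sensitive"]
prior = np.full(len(TYPES), 1.0 / len(TYPES))

# Illustrative likelihood P(cue | type) for the observed cue
# "opponent pushed back hard on price":
likelihood = np.array([0.8, 0.3, 0.2])

# Bayes' rule: posterior ∝ prior × likelihood, then normalize.
posterior = prior * likelihood
posterior /= posterior.sum()

for t, p in zip(TYPES, posterior):
    print(f"{t}: {p:.3f}")
```

Repeating this update over successive utterances concentrates the posterior on the type most consistent with the opponent's stated concerns, which is what drives the improved preference-estimation accuracy.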