Learning policy representations in multiagent systems

OpenAI Blog 06/17/18, 07:00 AM Papers

Summary

OpenAI researchers propose a general framework for modeling agent behavior in multiagent systems from only a small amount of interaction data. The framework casts agent modeling as an unsupervised representation learning problem and is evaluated in a competitive continuous-control environment and a cooperative communication environment.

Original Article

Cached at: 04/20/26, 02:46 PM

# Learning policy representations in multiagent systems

Source: [https://openai.com/index/learning-policy-representations-in-multiagent-systems/](https://openai.com/index/learning-policy-representations-in-multiagent-systems/)

OpenAI

## Abstract

Modeling agent behavior is central to understanding the emergence of complex phenomena in multiagent systems. Prior work in agent modeling has largely been task-specific and driven by hand-engineering domain-specific prior knowledge. We propose a general learning framework for modeling agent behavior in any multiagent system using only a handful of interaction data. Our framework casts agent modeling as a representation learning problem. Consequently, we construct a novel objective inspired by imitation learning and agent identification and design an algorithm for unsupervised learning of representations of agent policies. We demonstrate empirically the utility of the proposed framework in (i) a challenging high-dimensional competitive environment for continuous control and (ii) a cooperative environment for communication, on supervised predictive tasks, unsupervised clustering, and policy optimization using deep reinforcement learning.
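The abstract names the two ingredients of the objective (imitation learning and agent identification) but does not give its form. The sketch below is one possible reading, not the paper's actual loss: it assumes a categorical action likelihood for the imitation term and a softplus triplet-style term for identification. Every name here (`imitation_nll`, `identification_loss`, `combined_objective`, the weight `lam`) is hypothetical and chosen only for illustration.

```python
import math


def imitation_nll(action_probs, actions):
    """Mean negative log-likelihood of the observed actions.

    action_probs: list of per-step probability lists over discrete actions,
                  as predicted by a policy conditioned on an agent embedding.
    actions:      list of the action indices the agent actually took.
    """
    total = sum(math.log(probs[a]) for probs, a in zip(action_probs, actions))
    return -total / len(actions)


def identification_loss(anchor, positive, negative):
    """Triplet-style term (an assumed form, not taken from the source).

    anchor and positive are embeddings computed from two episodes of the
    same agent; negative comes from a different agent. The loss is small
    when the anchor lies closer to the positive than to the negative.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    # Softplus of the distance gap: smooth and always positive.
    return math.log1p(math.exp(d_pos - d_neg))


def combined_objective(action_probs, actions, anchor, positive, negative, lam=1.0):
    """Imitation term plus a weighted identification term (lam is hypothetical)."""
    return imitation_nll(action_probs, actions) + lam * identification_loss(
        anchor, positive, negative
    )


if __name__ == "__main__":
    probs = [[0.7, 0.3], [0.2, 0.8]]   # toy predicted action distributions
    taken = [0, 1]                     # toy observed actions
    same_agent = ([0.0, 0.0], [0.1, 0.0], [3.0, 4.0])
    print(combined_objective(probs, taken, *same_agent))
```

In this reading, minimizing the first term pushes the embedding to carry enough information to reproduce an agent's actions, while the second pulls embeddings of the same agent together and pushes those of different agents apart.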

Similar Articles

Learning to cooperate, compete, and communicate

OpenAI Blog

OpenAI presents research on multi-agent reinforcement learning environments where agents learn to cooperate, compete, and communicate. The paper introduces MADDPG (Multi-Agent DDPG), a centralized critic approach that enables agents to learn collaborative strategies and communication protocols more effectively than traditional decentralized methods.

Learning to communicate

OpenAI Blog

OpenAI researchers demonstrate that cooperative AI agents can develop their own grounded and compositional language through reinforcement learning in simple worlds. The agents learn to communicate by being rewarded for achieving goals that require coordination, creating shared symbolic languages to coordinate behavior.

Learning Agentic Policy from Action Guidance

arXiv cs.CL

The paper proposes ActGuide-RL, a method for training agentic policies in LLMs by using human action data as guidance to overcome exploration barriers in reinforcement learning without extensive supervised fine-tuning.

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs

arXiv cs.AI

This paper studies when end-to-end reinforcement learning training improves multi-agent LLM workflows, comparing shared-policy and isolated-policy training across different workflows, tasks, and model scales, revealing conditional tradeoffs.

NeuroMAS: Multi-Agent Systems as Neural Networks with Joint Reinforcement Learning