marl

#marl

HyPOLE: Hyperproperty-Guided Multi-Agent Reinforcement Learning under Partial Observation

arXiv cs.AI · 2026-07-01

HyPOLE introduces a framework for multi-agent reinforcement learning under partial observability that uses hyperproperty-guided learning via HyperLTL temporal logic, integrated with centralized training for decentralized execution, and demonstrates improvements over baselines on SMAC, MessySMAC, and WildFire benchmarks.

#marl

HiComm: Hierarchical Communication for Multi-agent Reinforcement Learning

arXiv cs.AI · 2026-06-30

HiComm is a plug-in communication module for cooperative multi-agent reinforcement learning that grounds messages in the sender's hierarchical observation structure, using a receiver-driven query and three-stage decoding to reduce communication volume by up to 23x.

#marl

Metric-Gradient Projection for Stable Multi-Agent Policy Learning

arXiv cs.LG · 2026-05-20

Introduces HPML, a method that projects the joint update field of a multi-agent system onto its metric-gradient component to stabilize multi-agent reinforcement learning. The paper provides theoretical guarantees and reports improved stability and returns on CTDE benchmarks.

#marl

Macro-Action Based Multi-Agent Instruction Following through Value Cancellation

arXiv cs.AI · 2026-05-14

Proposes MAVIC, a method for multi-agent reinforcement learning that corrects value estimates at instruction boundaries to enable compliance with external natural language instructions while preserving base task performance.
