Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding
Summary
This paper introduces LC-MAPF, a pre-trained model with a learnable communication module for multi-agent pathfinding that improves coordination and outperforms existing learning-based solvers while maintaining scalability.
Cached at: 05/15/26, 12:25 PM
Source: https://huggingface.co/papers/2605.07637
Abstract
A multi-agent pathfinding solver enhanced with a learnable communication module improves coordination and performance while maintaining scalability.
Multi-agent pathfinding (MAPF) is a widely used abstraction for multi-robot trajectory planning problems, where multiple homogeneous agents move simultaneously within a shared environment. Although solving MAPF optimally is NP-hard, scalable and efficient solvers are critical for real-world applications such as logistics and search-and-rescue. To this end, the research community has proposed various decentralized suboptimal MAPF solvers that leverage machine learning. Such methods frame MAPF (from a single-agent perspective) as a Dec-POMDP, where at each time step an agent has to choose an action based on its local observation, and typically solve the problem via reinforcement learning or imitation learning. We follow the same approach but additionally introduce a learnable communication module tailored to enhance cooperation between agents via efficient feature sharing. We present Local Communication for Multi-agent Pathfinding (LC-MAPF), a generalizable pre-trained model that applies multi-round communication between neighboring agents to exchange information and improve their coordination. Our experiments show that the introduced method outperforms the existing learning-based MAPF solvers, including IL- and RL-based approaches, across diverse metrics in a wide range of (unseen) test scenarios. Remarkably, the introduced communication mechanism does not compromise LC-MAPF's scalability, a common bottleneck for communication-based MAPF solvers.
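The abstract does not give the architecture of the communication module, so the following is only a rough sketch of what multi-round feature sharing between neighboring agents could look like. Everything in it is an assumption made for illustration: the names `neighbors_within` and `communicate`, the Chebyshev-distance communication radius, and above all the fixed averaging rule, which stands in for the learnable module described in the paper.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]
Features = List[float]


def neighbors_within(positions: List[Position], radius: int) -> Dict[int, List[int]]:
    """Map each agent index to the other agents within `radius` cells.

    Chebyshev distance on a grid is an assumption made for this sketch.
    """
    result: Dict[int, List[int]] = {i: [] for i in range(len(positions))}
    for i, (xi, yi) in enumerate(positions):
        for j, (xj, yj) in enumerate(positions):
            if i != j and max(abs(xi - xj), abs(yi - yj)) <= radius:
                result[i].append(j)
    return result


def communicate(
    features: List[Features],
    positions: List[Position],
    radius: int = 2,
    rounds: int = 2,
    mix: float = 0.5,
) -> List[Features]:
    """Run several rounds of local feature sharing.

    In every round each agent blends its own feature vector with the mean of
    its neighbors' vectors. The fixed blend weight `mix` is a hypothetical
    stand-in for a learned update; it is not the method from the paper.
    """
    neighbors = neighbors_within(positions, radius)
    current = [list(f) for f in features]
    for _ in range(rounds):
        updated: List[Features] = []
        for i, own in enumerate(current):
            nbrs = neighbors[i]
            if not nbrs:
                # An isolated agent receives no messages and keeps its features.
                updated.append(list(own))
                continue
            dim = len(own)
            mean_msg = [sum(current[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
            updated.append([(1 - mix) * own[d] + mix * mean_msg[d] for d in range(dim)])
        current = updated
    return current


# Two nearby agents exchange features; the distant third agent is untouched.
positions = [(0, 0), (1, 1), (10, 10)]
features = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
shared = communicate(features, positions)
```

In this toy setup information only flows between agents inside each other's radius, and repeated rounds let it travel further along chains of neighbors. The neighbor search here is a brute-force loop over all pairs, chosen for brevity rather than for scale.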
Similar Articles
COAgents: Multi-Agent Framework to Learn and Navigate Routing Problems Search Space
COAgents is a cooperative multi-agent framework for solving Vehicle Routing Problems that models search as a graph, using specialized agents for node selection, move selection, and jumps to escape local minima. It achieves state-of-the-art results on CVRP and VRPTW benchmarks, reducing the gap to best-known solutions by up to 44% compared to prior learning-based methods.
Learning to cooperate, compete, and communicate
OpenAI presents research on multi-agent reinforcement learning environments where agents learn to cooperate, compete, and communicate. The paper introduces MADDPG (Multi-Agent DDPG), a centralized critic approach that enables agents to learn collaborative strategies and communication protocols more effectively than traditional decentralized methods.
Learning Transferable Topology Priors for Multi-Agent LLM Collaboration Across Domains
This paper proposes TopoPrior, a framework that learns transferable topology priors from offline reference collaboration graphs to generate initial topologies for multi-agent LLM collaboration across domains, significantly reducing online search overhead and token consumption.
Counterfactual Graph for Multi-Agent LLM Calibration
This paper introduces CAGE, a counterfactual graph-based method for calibrating multi-agent LLM systems, evaluating on benchmarks like TriviaQA and MMLU-Pro across various communication topologies. The method outperforms existing post-hoc and LLM-elicited calibration approaches.
MAP: A Map-then-Act Paradigm for Long-Horizon Interactive Agent Reasoning
The paper proposes the Map-then-Act Paradigm (MAP), a plug-and-play framework that moves environmental understanding ahead of execution in interactive LLM agents, achieving consistent gains across benchmarks and enabling frontier models to surpass near-zero baseline performance in 22 of 25 game environments.