Decentralized Multi-Agent Systems with Shared Context
Summary
This paper introduces Decentralized Language Models (DeLM), a framework for multi-agent systems that uses parallel agents with a shared verified context to improve test-time scaling and reduce costs, achieving state-of-the-art results on SWE-bench Verified and LongBench-v2.
Cached at: 06/10/26, 05:46 PM
Source: https://huggingface.co/papers/2606.10662
Abstract
The Decentralized Language Models (DeLM) framework enables scalable large language model reasoning through parallel agents that coordinate asynchronously via a shared verified context, improving performance and efficiency over centralized approaches.
Multi-agent systems (MAS) can scale large language model reasoning at test time by decomposing complex problems into parallel subtasks. However, most existing MAS rely on centralized orchestration, where a main agent assigns work, collects outputs, and merges results. As the number of subtasks grows, this controller becomes a communication and integration bottleneck. We propose Decentralized Language Models (DeLM), a MAS framework that decentralizes coordination through parallel agents, a shared verified context, and a task queue. Agents asynchronously claim subtasks, read accumulated progress, perform local reasoning, and write back compact verified updates. The shared context acts as a common communication substrate, enabling agents to build on one another’s verified progress without routing every update through a central controller. Empirically, DeLM improves both software-engineering test-time scaling and long-context reasoning. On SWE-bench Verified, DeLM achieves the best performance across Avg.@1, Pass@2, and Pass@4, with gains of up to 10.5 percentage points over the strongest baseline, while reducing cost per task by roughly 50%. On LongBench-v2 Multi-Doc QA, DeLM achieves the highest average accuracy across four frontier model families, improving over the strongest baseline by up to 5.7 percentage points. The code is available on our project website at https://yuzhenmao.github.io/DeLM/.
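The coordination loop described in the abstract (claim a subtask, read accumulated progress, reason locally, write back a verified update) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `SharedContext`, `local_reasoning`, `verify`, `agent`, and `run` are hypothetical, threads stand in for LLM agents, and the verifier is a trivial placeholder for whatever check DeLM actually applies.

```python
import queue
import threading


class SharedContext:
    """Hypothetical shared context: an append-only log of verified updates."""

    def __init__(self):
        self._entries = []
        self._lock = threading.Lock()

    def read(self):
        # Return a snapshot so agents never observe a list mid-mutation.
        with self._lock:
            return list(self._entries)

    def write_verified(self, update, verify):
        # Only updates that pass the verifier are admitted to the context.
        if not verify(update):
            return False
        with self._lock:
            self._entries.append(update)
        return True


def local_reasoning(subtask, progress):
    # Placeholder for an LLM call that would condition on `progress`.
    return f"result for {subtask}"


def verify(update):
    # Placeholder check (assumption): accept any non-blank update.
    return bool(update.strip())


def agent(tasks, context):
    # Each agent claims subtasks on its own; no central controller assigns work.
    while True:
        try:
            subtask = tasks.get_nowait()
        except queue.Empty:
            return
        progress = context.read()
        update = local_reasoning(subtask, progress)
        context.write_verified(update, verify)


def run(subtasks, num_agents=4):
    tasks = queue.Queue()
    for subtask in subtasks:
        tasks.put(subtask)
    context = SharedContext()
    workers = [
        threading.Thread(target=agent, args=(tasks, context))
        for _ in range(num_agents)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return context.read()


if __name__ == "__main__":
    print(run(["parse logs", "locate bug", "write patch"], num_agents=2))
```

Because agents run concurrently, the order of entries in the returned context can vary between runs. The point of the sketch is only that the queue and the shared context replace the central controller as the coordination channel.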
Get this paper in your agent:
hf papers read 2606.10662
Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Similar Articles
SMAC-Talk: A Natural Language Extension of the StarCraft Multi-Agent Challenge for Large Language Models
SMAC-Talk is a new benchmark that extends the StarCraft Multi-Agent Challenge to evaluate LLM-based agents in cooperative multi-agent environments with natural language communication. It includes scenarios with deceptive communicators and benchmarks agents using models from the Qwen3.5 family to study how reasoning, memory, and scale affect coordination.
GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization (V1.0)
This paper introduces GenericAgent, a self-evolving LLM agent system designed to maximize context information density. It addresses long-horizon limitations through hierarchical memory, reusable SOPs, and efficient compression, achieving better performance with fewer tokens compared to leading agents.
Learning Agent-Compatible Context Management for Long-Horizon Tasks
Introduces AdaCoM, an external LLM-based context manager for frozen agents, using reinforcement learning to improve long-horizon task performance by preserving task constraints and pruning stale content, with experiments on web search and deep research benchmarks.
TMAS: Scaling Test-Time Compute via Multi-Agent Synergy
TMAS introduces a multi-agent framework that enhances large language model reasoning by scaling test-time compute through structured collaboration and hierarchical memory systems. The approach uses specialized agents, cross-trajectory information flow, and hybrid reward reinforcement learning to improve iterative scaling and stability on challenging reasoning benchmarks.
Less Context, Better Agents: Efficient Context Engineering for Long-Horizon Tool-Using LLM Agents
This paper evaluates context engineering configurations for LLM agents in enterprise tool-use workflows, showing that summarization with selective pruning achieves 91.6% accuracy while reducing token usage by over 60% compared to full-context baselines.