This paper introduces Decentralized Language Models (DeLM), a multi-agent framework in which parallel agents work from a shared, verified context to improve test-time scaling and reduce cost. It reports state-of-the-art results on SWE-bench Verified and LongBench-v2.
LongTraceRL introduces tiered distractor construction and rubric reward design to improve long-context reasoning in language models using reinforcement learning. The method generates multi-hop questions via knowledge graph random walks and uses search agent trajectories to build challenging distractors, with a rubric reward providing entity-level process supervision.
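As a rough illustration of the two ideas in the LongTraceRL summary, the sketch below walks a toy knowledge graph to obtain a multi-hop chain and scores a response by how many intermediate entities it mentions. It is not the paper's implementation: the graph `TOY_KG`, the helper names `random_walk` and `rubric_reward`, and the substring-matching rule are all assumptions made for the example, and distractor construction is not covered.

```python
import random

# Toy knowledge graph: head entity -> list of (relation, tail entity).
# Entities and relations are made up for illustration only.
TOY_KG = {
    "Ada": [("born_in", "London")],
    "London": [("capital_of", "England")],
    "England": [("part_of", "United Kingdom")],
    "United Kingdom": [],
}


def random_walk(kg, start, hops, rng):
    """Follow up to `hops` randomly chosen edges from `start`.

    Returns the visited (head, relation, tail) triples; the walk stops
    early if it reaches an entity with no outgoing edges.
    """
    path = []
    current = start
    for _ in range(hops):
        edges = kg.get(current, [])
        if not edges:
            break
        relation, tail = rng.choice(edges)
        path.append((current, relation, tail))
        current = tail
    return path


def rubric_reward(response, rubric_entities):
    """Fraction of rubric entities mentioned in the response.

    Assumption: a case-insensitive substring match stands in for
    whatever entity matching a real system would use.
    """
    if not rubric_entities:
        return 0.0
    text = response.lower()
    hits = sum(1 for entity in rubric_entities if entity.lower() in text)
    return hits / len(rubric_entities)


# Build one 3-hop chain; the last tail is the answer, and the earlier
# tails are the intermediate entities a good reasoning trace should name.
path = random_walk(TOY_KG, "Ada", hops=3, rng=random.Random(0))
intermediate = [tail for _, _, tail in path[:-1]]
answer = path[-1][2]

full = rubric_reward("Ada was born in London, the capital of England.", intermediate)
partial = rubric_reward("She was born in London.", intermediate)
```

Because each entity in the toy graph has a single outgoing edge, the walk here is deterministic; in a real graph the random choice at each hop is what yields varied question chains. A response that names every intermediate entity scores higher than one that skips a hop, which is the sense in which such a reward supervises the reasoning process and not just the final answer.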
MemReread introduces a method for long-context reasoning that avoids intermediate retrieval by decomposing questions and rereading text to recover discarded information, achieving linear time complexity. It outperforms baseline frameworks on long-context reasoning tasks.