LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling
Summary
LoopCoder-v2 proposes Parallel Loop Transformers (PLT) for efficient test-time computation scaling in code generation, showing that two loops yield significant gains while more loops cause diminishing returns and positional mismatch costs.
Cached at: 06/17/26, 03:35 AM
Source: https://huggingface.co/papers/2606.18023
Abstract
Parallel loop Transformers achieve better code generation performance with two loops due to refined representations, while additional loops cause diminishing returns and increased positional mismatch costs.
Looped Transformers scale latent computation by repeatedly applying shared blocks, but sequential looping increases latency and KV-cache memory with the loop count. Parallel loop Transformers (PLT) alleviate this cost through cross-loop position offsets (CLP) and shared-KV gated sliding-window attention, making loop count a practical design choice. We therefore study PLT loop-count selection through a gain-cost view: an extra loop may refine representations, but CLP also introduces a positional mismatch at each loop boundary. We instantiate this study by training LoopCoder-v2, a family of 7B PLT coders with different loop counts, from scratch on 18T tokens, followed by matched instruction tuning and evaluation. Empirically, the two-loop variant delivers broad gains over the non-looped baseline across code generation, code reasoning, agentic software engineering, and tool-use benchmarks, improving SWE-bench Verified from 43.0 to 64.4 points and Multi-SWE from 14.0 to 31.0 points. In contrast, variants with three or more loops regress, revealing a strongly non-monotonic loop-count effect. Our diagnostics show that loop 2 provides the main productive refinement, while later loops yield diminishing, oscillatory updates and reduced representational diversity. Because the CLP-induced mismatch remains roughly fixed as refinement gains shrink, the offset cost increasingly dominates. This gain-cost trade-off explains PLT’s saturation at two loops and provides diagnostics for loop-count selection.
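The gain-cost argument in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's model or its diagnostics: the function names, the geometric decay of the refinement gain, and all numbers are hypothetical, chosen only to show how a fixed per-boundary cost combined with shrinking gains produces a peak at a small loop count.

```python
def net_benefit(loops, first_gain, decay, offset_cost):
    """Toy net benefit of a model with `loops` passes over the shared blocks.

    Assumptions (illustrative, not taken from the paper):
    - loops=1 is the non-looped baseline with net benefit 0;
    - the k-th extra loop adds a gain of first_gain * decay**(k - 1);
    - every loop boundary incurs the same fixed offset_cost.
    """
    boundaries = loops - 1
    gain = sum(first_gain * decay**k for k in range(boundaries))
    cost = offset_cost * boundaries
    return gain - cost


def best_loop_count(max_loops, first_gain, decay, offset_cost):
    """Return the loop count in 1..max_loops with the highest toy net benefit."""
    return max(
        range(1, max_loops + 1),
        key=lambda n: net_benefit(n, first_gain, decay, offset_cost),
    )


if __name__ == "__main__":
    # Hypothetical numbers: a large first refinement, fast decay, fixed cost.
    for n in range(1, 6):
        print(n, round(net_benefit(n, 10.0, 0.2, 4.0), 2))
    print("best:", best_loop_count(5, 10.0, 0.2, 4.0))
```

With these made-up values the second pass gains more than the boundary costs, while every later pass gains less than it costs, so the net benefit peaks at two loops. Different values would move the peak; the point is only that a roughly constant cost eventually outweighs a shrinking gain.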
Similar Articles
Multilingual-Multimodal-NLP/LoopCoder-V2 · Hugging Face
LoopCoder-V2 is a 7B instruction-tuned code model built on the Parallel Loop Transformer (PLT), demonstrating non-monotonic test-time scaling with two loops providing the best gain-cost trade-off and significant improvements over baselines on code generation and reasoning benchmarks.
@DorothyDDU: LoopCoder-v2 is out Loop Transformers reuse the same block for recurrent hidden-state refinement — letting models “thin…
This paper introduces LoopCoder-v2, a family of 7B parameter parallel loop transformers for code generation, and studies the optimal number of loops, finding that two loops yield significant gains while more loops cause degradation.
@rohanpaul_ai: Big claim in this paper, pushes against the common idea that more test-time compute should keep helping. Claims a code …
This paper introduces LoopCoder-v2, a 7B code model that benefits most from a single rethinking loop; additional loops degrade performance, challenging the assumption that more test-time compute always helps.
PaT: Planning-after-Trial for Efficient Test-Time Code Generation
This paper introduces PaT (Planning-after-Trial), an adaptive test-time computation strategy for code generation that reduces inference costs by approximately 69% while maintaining performance comparable to larger models.
Scaling Test-Time Compute for Agentic Coding
A test-time scaling framework for agentic coding that compresses rollout trajectories into structured summaries and uses recursive voting/PDR to boost Claude-4.5-Opus to 77.6% on SWE-Bench Verified.