PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents
Summary
The paper introduces PACEvolve++, a reinforcement learning framework that improves test-time policy adaptation for evolutionary search agents by decoupling hypothesis generation from execution.
Source: https://huggingface.co/papers/2605.07039
Abstract
PACEvolve++ enables adaptive policy selection in evolutionary search through a reinforcement learning framework that decouples hypothesis generation from execution while adapting optimization strategies across evolutionary phases.
Large language models have become drivers of evolutionary search, but most systems rely on a fixed, prompt-elicited policy to sample next candidates. This limits adaptation in practical engineering and research tasks, where evaluations are expensive and progress depends on learning task-specific search dynamics. We introduce PACEvolve++, an advisor-model reinforcement learning framework for test-time policy adaptation in evolutionary search agents. PACEvolve++ decouples strategic search decisions from implementation: a trainable advisor generates, assesses, and selects hypotheses, while a stronger frontier model translates selected hypotheses into executable candidates. To train the advisor under non-stationary feedback, we propose a phase-adaptive approach that adapts its optimization strategy to different phases of the evolutionary process. Early in evolution, it uses group-relative feedback to learn broad search preferences; later, as reward gaps compress, it emphasizes best-of-k frontier contribution to support stable refinement. Across expert-parallel load balancing, sequential recommendation, and protein fitness extrapolation, PACEvolve++ outperforms the state-of-the-art evolutionary search framework with frontier models, achieving faster convergence and stabilizing test-time training during evolutionary search.
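The phase-adaptive idea in the abstract can be illustrated with a small sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: the function names, the GRPO-style normalization used for "group-relative feedback", the way "best-of-k frontier contribution" is credited to the single best candidate, and the reward-spread threshold that switches between the two phases. It is not the authors' implementation.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Assumed early-phase signal: center and scale each reward within its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def best_of_k_contribution(rewards: List[float], incumbent: float) -> List[float]:
    """Assumed late-phase signal: credit only the best of the k candidates,
    by how much it improves on the current best-known (incumbent) score."""
    best = max(rewards)
    gain = max(best - incumbent, 0.0)
    best_idx = rewards.index(best)
    return [gain if i == best_idx else 0.0 for i in range(len(rewards))]


def phase_adaptive_signal(
    rewards: List[float],
    incumbent: float,
    gap_threshold: float = 0.05,  # hypothetical switch point, not from the paper
) -> Tuple[str, List[float]]:
    """Pick a training signal based on how compressed the reward gaps are."""
    spread = max(rewards) - min(rewards)
    if spread >= gap_threshold:
        return "group_relative", group_relative_advantages(rewards)
    return "best_of_k", best_of_k_contribution(rewards, incumbent)


if __name__ == "__main__":
    # Early evolution: rewards are widely spread across the group.
    print(phase_adaptive_signal([0.1, 0.5, 0.9], incumbent=0.4))
    # Late evolution: reward gaps have compressed near the frontier.
    print(phase_adaptive_signal([0.90, 0.91, 0.92], incumbent=0.905))
```

In this sketch, a wide reward spread yields a dense signal for every candidate in the group, while a compressed spread yields a sparse signal that rewards only a candidate that actually advances the frontier. That is one plausible reading of why the later phase would support more stable refinement.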
Similar Articles
EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales
EVOCHAMBER is a training-free, multi-agent test-time evolution framework that enables emergent specialization through collaborative reflection and asymmetric knowledge transfer across individual, team, and population scales, achieving significant improvements on math, code, and reasoning tasks.
EvoTest: Evolutionary Test-Time Learning for Self-Improving Agentic Systems
EvoTest introduces J-TTL, a benchmark for measuring agent test-time learning capabilities, and proposes an evolutionary framework where an Actor Agent plays games while an Evolver Agent iteratively improves the system's prompts, memory, and hyperparameters without fine-tuning. The method demonstrates superior performance compared to reflection and memory-based baselines on complex text-based games.
PACE: Two-Timescale Self-Evolution for Small Language Model Agents
PACE introduces a two-timescale framework for self-evolution of small language model agents, coordinating low-risk prompt refinement with higher-risk control-logic updates, achieving up to +9.2% relative improvement across benchmarks.
EvoTrainer: Co-Evolving LLM Policies and Training Harnesses for Autonomous Agentic Reinforcement Learning
EvoTrainer introduces an autonomous training framework that co-evolves LLM policies and training harnesses through empirical feedback, outperforming human-engineered RL baselines on mathematical reasoning, code generation, and long-horizon software engineering tasks.
EvoSci: A Bio-Inspired Multi-Agent Framework for the Evolution of Scientific Discovery
EvoSci proposes a bio-inspired multi-agent framework that integrates evolutionary algorithms with knowledge graph modeling to iteratively generate, evaluate, and refine research ideas, achieving top performance in peer-review evaluations.